

How Akka Cluster Works: Actors Living in a Cluster (lightbend.com)

109 points by andersson42 1190 days ago | 62 comments

halfmatthalfcat 1189 days ago [-]

I use Akka Cluster extensively with Persistence. It's an amazing piece of technology.

Before I went this route, I tried to make Akka Cluster work with RabbitMQ; however, I realized (like another poster here) that you're essentially duplicating concerns since Akka itself is a message queue. There's also a ton of logistics with Rabbit around binding queues, architecting your route patterns, etc. that add extra cognitive overhead.

I'm creating a highly distributed chat application where each user has their own persistent actor and each chatroom has its own persistent actor. At this point, it doesn't matter where the user or chatroom is in the cluster; it literally "just works".

All I need to do is emit a message to the cluster from a user to chatroom or vice versa, even in a cluster of hundreds of nodes, and things just work. Now there's some extra care you need to take at the edge (split-brain via multi-az, multi-datacenter) but those are things you worry about at scale.

Akka is the real fucking deal and it's one of the most pleasurable application frameworks I've ever had the pleasure of using in my career.

edit: The only reason I'd ever want to use Rabbit again is if I had external clients that needed to hook up to our message bus. If you're creating an entirely internal system, Akka Cluster is absolutely the way to go.

acjohnson55 1189 days ago [-]

I'd echo this.

I worked for a company that made a real-time auction system, based on Akka. It's been frustrating in the years since to program on less powerful foundations.

If I were building a system that combined interactive and autonomous processes, I would absolutely reach for Akka again. The one thing I'd love to see is if they could build something like https://temporal.io on top of Akka. I think it would be complementary to the state machine style model of typed actors and the pipeline model of Akka Streams.

playing_colours 1189 days ago [-]

I remember using Akka in the domain of IoT. A persistent actor represented a sensor: state and history of readings. There was a great feature in Akka Persistence: if an actor went idle for a while - no new sensor data, it would be offloaded from memory to the storage (Cassandra). As soon as the sensor started to send new signals again or the sensor state was queried by a user the actor got loaded back to memory.
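The offload-and-reload behaviour described above can be sketched roughly as follows. This is a plain-Python illustration, not Akka's API: `SensorActor`, `PassivatingRegistry`, the `idle_timeout` parameter, and the dict standing in for Cassandra are all hypothetical names, and the sketch stores a snapshot of the state instead of replaying a persisted event journal.

```python
import time


class SensorActor:
    """Holds the in-memory state for one sensor (hypothetical shape)."""

    def __init__(self, sensor_id, readings=None):
        self.sensor_id = sensor_id
        self.readings = list(readings or [])

    def handle(self, reading):
        self.readings.append(reading)


class PassivatingRegistry:
    """Keeps active actors in memory and offloads idle ones to a store."""

    def __init__(self, store, idle_timeout, clock=time.monotonic):
        self.store = store                # stand-in for durable storage
        self.idle_timeout = idle_timeout  # seconds of inactivity before offload
        self.clock = clock
        self.active = {}                  # sensor_id -> (actor, last_used)

    def _load(self, sensor_id):
        entry = self.active.get(sensor_id)
        if entry is None:
            # Rehydrate from storage, or start empty for a new sensor.
            actor = SensorActor(sensor_id, self.store.get(sensor_id))
        else:
            actor = entry[0]
        self.active[sensor_id] = (actor, self.clock())
        return actor

    def tell(self, sensor_id, reading):
        self._load(sensor_id).handle(reading)

    def query(self, sensor_id):
        return list(self._load(sensor_id).readings)

    def passivate_idle(self):
        now = self.clock()
        for sensor_id, (actor, last_used) in list(self.active.items()):
            if now - last_used >= self.idle_timeout:
                self.store[sensor_id] = list(actor.readings)  # persist state
                del self.active[sensor_id]                    # free memory
```

Both new sensor data (`tell`) and a user query (`query`) bring a passivated actor back into memory, which matches the two triggers mentioned in the comment.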

vinay_ys 1189 days ago [-]

How much sensor data is kept in memory in the actor object? What's the cost tradeoff between memory and an SSD? I wonder if an SSD-based solution would still be cheaper and more scalable than a solution based on live in-memory objects.

playing_colours 1189 days ago [-]

What sensor data and how much depend on the access needs: you can keep N latest records or some rolling aggregated data, if they are frequently accessed by some rule engine. It's certainly not feasible to hold big chunks of sensor data in memory.

amelius 1189 days ago [-]

Could one implement a distributed filesystem using Akka?

halfmatthalfcat 1189 days ago [-]

You could but I don't think it would be the best tool for it. When I think of Akka, I'm using it because I don't want to worry about which node in my cluster any given actor is on, I just want to be able to scale horizontally and messages are routed appropriately.

Akka embraces the "let it fail" mentality: as nodes go down (just as pods go down in Kubernetes), you don't have to worry about where your processes are running...they just are...somewhere.

dragosmocrii 1189 days ago [-]

Sounds like Akka and Erlang/Elixir have a lot in common

hawk_ 1189 days ago [-]

Yes, Akka is essentially Erlang's actor model for the JVM

hugofirth 1189 days ago [-]

To offer a slightly dissenting opinion, we’ve had many issues with Akka over the years:

- if you roll your cluster membership a lot the dotted version vectors which are created by Akka distributed data grow unbounded. Eventually they will start making gossip messages exceed the default maximum size (a few kB IIRC) and fail to send.

- in the presence of heavy GC Akka cluster has a really bad time. Members will flip flop in marking each other unavailable. Eventually this will render the leader unable to perform its duties and you will struggle to (for example) allow a previously downed member to rejoin the cluster.

- orderly actor system shutdown will also fail under high GC, which is problematic as sometimes you need to restart your actor system.

- split-brain resolution is really really hard to get right. The Akka team have recently made theirs open source I believe which is good, but back when we were building with Akka cluster it required a Lightbend subscription.

- If you aren’t all in on Actors, the integration point between Akka and the rest of your codebase can be a little odd. You often feel like you should reach for `Patterns.ask` (a way of sending a message to an actor and then getting a Future back which will complete on a particular response) but then people tell you that’s an anti-pattern.
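For readers who haven't met the "ask" idea from the last bullet, here is a minimal sketch of the shape of it in plain Python. It is not Akka's `Patterns.ask`; `CounterActor` and its message names are made up for illustration. The point is only the difference between fire-and-forget `tell` and an `ask` that hands back a future the actor completes later.

```python
import queue
import threading
from concurrent.futures import Future


class CounterActor:
    """Processes one message at a time from its mailbox on a single thread."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message, reply_to=None):
        # Fire-and-forget: enqueue and return immediately.
        self._mailbox.put((message, reply_to))

    def ask(self, message):
        # "Ask": attach a Future that the actor completes with its reply.
        reply = Future()
        self.tell(message, reply_to=reply)
        return reply

    def _run(self):
        while True:
            message, reply_to = self._mailbox.get()
            if message == "stop":
                break
            if message == "increment":
                self._count += 1
            elif message == "get" and reply_to is not None:
                reply_to.set_result(self._count)

    def stop(self):
        self.tell("stop")
        self._thread.join()
```

Non-actor code can call `actor.ask("get").result(timeout=2)` and block on the answer. One commonly cited reason this gets called an anti-pattern is visible even here: the caller needs a timeout because nothing guarantees the actor ever replies.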

————

Having said all the above, if you’re able to go all in on the Actor pattern and you’re unlikely to hit high GC then you should give Akka cluster a try. The problems it tackles are genuinely hard and you should build on their hard work if you can. In particular they offer (in distributed-data) the most robust/complete set of CRDTs I’ve yet come across. Many other CRDT libraries expect you to bring your own gossip protocol and transport layer.

cutemonster 1189 days ago [-]

How would you compare Akka with Erlang? Or is that a weird thing to ask here?

What made you choose Akka?

hugofirth 1189 days ago [-]

We're a big Java/Scala shop already, so Akka was easier to integrate :)

kdps 1189 days ago [-]

Regarding garbage collection: do you think ZGC or Shenandoah could reduce the problems you mentioned?

hugofirth 1189 days ago [-]

I believe it would have alleviated some of the issues yes. In general I’m excited for the benefits “pauseless” GC can bring to soft real-time systems on the JVM. Unfortunately, for now we have to continue supporting G1 and friends.

tunesmith 1190 days ago [-]

In scala-land, a lot of people like to scoff at Akka because they prefer other pure fp concepts, but I don't think they've found a replacement for Akka Cluster - where you need objects that have both state and behavior, meaning they need to exist in memory, and where there are too many to exist on one server.

If you don't need behavior, you can use things like distributed databases or caches, and if you don't need to scale out, there are other pure fp solutions. But for this kind of distributed behavior, it still seems to me that Akka Cluster is the killer app.

darksaints 1189 days ago [-]

I don't think the scala crowd scoffs at Akka because they are beholden to pure functional programming. The pure FP crowd is actually a minority...significant enough to acknowledge, but not enough to make or break anything about the community.

The real problem with Akka is that, at least until very recently, you had to abandon any semblance of type safety if you used it. That was very frustrating to work with. I can take or leave pure FP, but you can pry my strong static typing from my cold dead hands.

AzzieElbab 1189 days ago [-]

I wouldn’t say all type safety. It just couldn’t prevent you from sending unhandled messages to actors

alextheparrot 1189 days ago [-]

I wonder what it would look like trying to implement something like the IO monad backed by Akka actors. Can't recall anything off the top of my head that would make that untenable aside from the aforementioned scoffing.

thelittlenag 1189 days ago [-]

Take a look at ZIO Actors (https://zio.github.io/zio-actors/).

alextheparrot 1189 days ago [-]

Thanks for the pointer, wasn't aware ZIO had an actor implementation. My interest was the inversion, though, which is “Could you make a (more or less) drop-in replacement for Future/IO/Task/ZIO that implements not just concurrency, but distributed concurrency using actors” (Where the actors could be Akka, ZIO, etc.).

AzzieElbab 1189 days ago [-]

I have seen several such implementations. They are fairly robust although feel a bit awkward when compared to streams. Scala Fs2 and zstreams are just too good once you figure them out. Neither provide clustering of course

tormeh 1189 days ago [-]

The actor model is brilliant, but I fear it will never get the adoption it deserves because it's too much of a break from convention. I guess it's maybe a bit too much of a leap? It replaces both Kubernetes and messaging queues, so once you've made software with it it's kinda hard to back out to a vanilla programming model.

There's of course the legit downside of needing to use one language and one framework for all actors, which is a problem Kubernetes and message queues don't have.

halfmatthalfcat 1189 days ago [-]

It doesn’t replace Kubernetes, it is the ultimate companion.

Kubernetes deals with the OS/node level failures, the actor system deals with the application level failures.

It’s actually amazing how complementary they are.

lostcolony 1189 days ago [-]

'legit downside of needing to use one language and one framework for all actors' - which is why a lot of Erlang users use it for coordination of messages, and delegate their handling to other services as needed.

That said, most work places I've been at, leadership has -wanted- to use one language. Even with containers and other decoupling technologies. So I don't know how much of a negative effect that downside has.

blandflakes 1189 days ago [-]

I've sort of come around in my career to aggressively simplifying the stack where possible... it's handy to be able to script stuff but I don't actually enjoy running a polyglot team. We get a lot more done with one language, one build tool, etc. I tend to save other languages for niche applications (e.g. Lua for nginx scripting).

dragonwriter 1189 days ago [-]

> There's of course the legit downside of needing to use one language and one framework for all actors

That's not inherent in the actor model, it's a potential artifact of some implementations. Though I think one VM is more common. E.g., BEAM instead of just Erlang, or JVM for Akka.

aaronmill1 1189 days ago [-]

How would the actor model replace Kubernetes?

acjohnson55 1189 days ago [-]

The way I see it, Kubernetes (with some type of message bus) is in some ways a system for deploying processes that will carry out processing, kind of like actors. It doesn't use the same formalisms, but functionally, it's quite similar. Or, you could say that Akka Cluster is like Kubernetes, except that every process is contained within an Actor and has to be written in Java or Scala.

jpcooper 1190 days ago [-]

Carl Hewitt, the inventor of the actor model, wrote this paper: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3418003, which he posted a link to on a recent post here on Erlang.

In it, he claims that Godel's Incompleteness Theorem is not true, and that the actor model is more general than the Turing machine. I am open to entertaining the idea.

I've seen that his ideas have been discredited elsewhere on HN. I would be interested to know people's opinions on this, as a lot of the paper went over my head.

hilbertseries 1190 days ago [-]

> he claims that Godel's Incompleteness Theorem is not true, and that the actor model is more general than the Turing machine

He is certainly wrong about the incompleteness theorems. And it’s entirely possible to create a model of computation that is more general than Turing’s. The question is whether it better represents what’s computable; the abstract mentions computations involving an “infinite number of computations” between steps...

jpcooper 1189 days ago [-]

How is he wrong about GIT?

What more general systems of computation are there than Turing? Are any in use? Are they really more general?

The claims of the generality of Actors seem to rely on continuous time and non-determinism. Actors, determinism, non-determinism, concurrency and the completeness axiom are models which we can use to express computation and our surroundings, and nothing more.

One man says lambda, another says actor. Given our models of physics; given Planck and Heisenberg, are they really different? If so, how? Measure theory rests on the completeness axiom, but it is just a very useful axiom.

Am I missing something?

macintux 1189 days ago [-]

Dr. Hewitt often pops up to discuss Erlang and the actor model here, so you might have the opportunity to ask him for more details.

29athrowaway 1190 days ago [-]

I used Akka cluster many years ago.

One of the problems I encountered was finding actors in remote actor systems, i.e., you have a unique actor responsible for X living somewhere in your cluster, and you need to know its name as well as the IP address of the actor system where it is running.

A message queue solves this problem, but that was not the approach I took.

My solution was to implement an actor discovery system on top of Zookeeper. Using that, I could have cluster-wide unique actors.

halfmatthalfcat 1189 days ago [-]

They have Cluster Singletons that alleviate this concern now. You don't need to know their physical location in the cluster. All you need to do is ask the cluster for a pointer to the Singleton and you can message it directly.

29athrowaway 1189 days ago [-]

That's cool. I wish I had this in 2014.

acjohnson55 1189 days ago [-]

I used Cluster Singleton with much success, starting at the very beginning of 2016: https://github.com/artsy/atomic-store/

29athrowaway 1189 days ago [-]

Are you still using Akka these days?

acjohnson55 1187 days ago [-]

Sorry for the delay!

Nope. The two companies I have worked at since have mature products that aren't built on the JVM or Scala. But I'd absolutely consider it if I were to build a system with similar constraints again.

There was a big learning curve, but the end result was a system that was built largely correctly, on a tight timeline, solving some tricky technical and business challenges.

playing_colours 1189 days ago [-]

Actor model may be a handy tool to build neural networks. I have not tried it yet in Akka or Erlang, but there is a whole book about it: https://www.springer.com/gp/book/9781461444626

t-writescode 1190 days ago [-]

I've written a high-availability service with Akka.NET and RabbitMQ and I remember when I was working with that infrastructure, my biggest question around Akka Cluster was "why would I use this when I already have a message queue infrastructure?"

Maybe real Akka is better than Akka.NET when it comes to Akka Cluster?

valenterry 1189 days ago [-]

Akka Cluster works in-memory, RabbitMQ doesn't.

Say you want to have multiple actors (one per user / customer or whatever) and you get HTTP requests and want exactly that actor to handle them (to guarantee consistency); then you can't really do this with RabbitMQ.

I mean, you can make the machine that receives the request push it to the queue and keep the HTTP connection alive, have the machine that is responsible for the user read it from a queue and then somehow tell the first machine how to respond to the HTTP request... but then you have pretty much re-implemented Akka Cluster in a worse way.

Persistent queues and Akka Cluster solve different usecases.

t-writescode 1189 days ago [-]

At that point, you're still operating on a single machine, though, and you don't need Akka Cluster for that.

valenterry 1189 days ago [-]

I don't understand what you mean.

In my scenario, multiple machines are used and necessary for the same service (otherwise there is no point in using Akka Cluster).

Example: online games. You have rooms/games being created on the fly and destroyed an hour later. There are many of these rooms and they are distributed over multiple machines. You can't really use RabbitMQ here, it's not performant enough.

And even if you did, you would pretty much reimplement what Akka Cluster does for you: instance synchronization, different strategies for handling split brain scenarios, dead letter handling, direct message forwarding, persistence (if needed) and so on

vinay_ys 1189 days ago [-]

If you want to host a game state object in memory (vs serialized and saved to SSD) because you have lots of frequent write/reads happening to that object in a very short window of time such that the IO cost and CPU cost (ser/deserialisation) is higher and the incremental latency is a blocker, then this design of hosting a full-blown object in memory within your runtime makes sense (number of reads per game object per second must be high and must be sustained for a good number of seconds for this tradeoff math to be in this design's favor, given today's SSD costs vs memory costs).

But I wonder if you will suffer from random GC pauses, inability to carefully isolate different behaviors into different resource clusters, resulting in uncontrolled blast radius etc.

If you are anyway doing persistence (because you care to not lose game progress), and whenever a cluster node dies you need to resurrect game state from persistence, I wonder if you will get the game state restored within a bounded latency.

If this happens frequently enough (to affect say 5% of your users – enough to kill your game experience), is the benefits of latency gain from in-memory object reads wiped out?

valenterry 1189 days ago [-]

I mean, you are right that Akka Cluster is JVM based and hence can bring the problems you mentioned. But then again, most high frequency trading also runs on the JVM, so it can often be worked around.

> If this happens frequently enough (to affect say 5% of your users – enough to kill your game experience), is the benefits of latency gain from in-memory object reads wiped out?

For this specific use-case, I don't think there is really an alternative, except for specifically a hand-crafted system (or non-scale, such as everyone hosts and manages their own server).

vinay_ys 1189 days ago [-]

> But then again, most high frequency trading also runs on the JVM, so it can often be worked around.

JVM is an awesome piece of technology. And you can do robotic control systems to high-frequency trading systems with it with careful programming.

But I've seen a lot of Java code running in production suffering from latency jitters and needing continuous profiling and optimization by a small group of performance engineers while the majority of application engineers keep adding to GC load.

> For this specific use-case, I don't think there is really an alternative, except for specifically a hand-crafted system

Yes, but I think the handcrafted system doesn't need to be very complex. It can be quite simple and easy to understand and tame to your needs as your scale and complexity grows.

randomopining 1188 days ago [-]

Doesn't it have the new GCs that are supposed to be minimal pause and gamechangers?

valenterry 1187 days ago [-]

Yes, but GC still must happen and sometimes even milliseconds can be a problem, depending on the use-case. It really depends on what your business does, so the OP's concerns with the JVM are very valid.

At the same time, I think (while not being an expert) that in the majority of the cases, the GC will not be a problem and that the time you can save from using Akka Cluster allows you to optimize your system more than enough to make up for any GC latency problems, in almost every system.

The only technology that might be better/comparable here is the erlang VM, but I have never used it myself.

t-writescode 1189 days ago [-]

You're not just talking online games, you're talking enormous-world MMOs, and that's such a far cry from the areas that I have worked with and thought about how to work with in any detail that I can't usefully add anything.

If you're trying to manage the concurrent state of 10s of thousands of players in a game, all server-side, and you want specific actors to handle that single player and you don't want a globally persistent state, then I suppose this makes sense.

I've never worked with anything of that scale though.

valenterry 1189 days ago [-]

> You're not just talking online games, you're talking enormous-world MMOs

I am talking about bigger scale here of course. But not necessarily what you describe.

Take games like League of Legends or Counterstrike as examples. Having one game per actor seems like a sensible design to me.

But yeah, I think that traditional techniques still get you very far. I heard that Slack was running just on (multiple) postgres for the longest time.

alexis2b 1189 days ago [-]

Interesting! Not trying to divert from Akka which is a wonderful piece of engineering, but your comment reminded me of Microsoft’s take on the Actor model with Project Orleans - which was used for the backend side of the Halo / Xbox MMO [0].

I think the GA version of Project Orleans is now called Service Fabric, although I never had the pleasure to try it.

[0] https://youtu.be/I91ZU8tEJkU

reubenbond 1186 days ago [-]

Orleans and Service Fabric are different. Orleans has been running in production for some time now and is actively developed on GitHub: https://github.com/dotnet/orleans. Teams inside Microsoft run it on top of Service Fabric (and Kubernetes, etc.) More details in this talk: https://youtu.be/KhgYlvGLv9c

Service Fabric has something called Reliable Actors which are heavily inspired by Orleans.

Source: I'm the project lead for Orleans

29athrowaway 1190 days ago [-]

Queues overlap with the messaging aspects of actors but not the supervision aspect of actors.

killingtime74 1190 days ago [-]

For that we have cluster management like kubernetes

29athrowaway 1190 days ago [-]

An actor can fail for reasons other than infrastructure issues. i.e.: unhandled exceptions.

In fact, unhandled exceptions are encouraged (i.e.: "let it crash" approach to fault tolerance).
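To make the "let it crash" point concrete, here is a rough sketch of supervision in plain Python. `Supervisor`, `Divider`, and `max_restarts` are hypothetical names, and real actor systems offer richer strategies (resume, stop, escalate, backoff). The idea shown is just that the child does no defensive error handling and the supervisor replaces it with a fresh instance when it fails.

```python
class Supervisor:
    """Restarts a child with fresh state whenever it raises."""

    def __init__(self, child_factory, max_restarts=3):
        self.child_factory = child_factory
        self.max_restarts = max_restarts
        self.restarts = 0
        self.child = child_factory()

    def deliver(self, message):
        try:
            return self.child.receive(message)
        except Exception:
            # The child does no defensive handling; the supervisor decides.
            if self.restarts >= self.max_restarts:
                raise  # escalate: too many failures
            self.restarts += 1
            self.child = self.child_factory()  # restart with clean state
            return None


class Divider:
    """Hypothetical child actor that crashes on bad input."""

    def __init__(self):
        self.handled = 0

    def receive(self, message):
        result = 100 / message  # raises ZeroDivisionError for 0
        self.handled += 1
        return result
```

A queue alone gives you none of this: it can redeliver a message, but it has no opinion on what to do with the component that failed while handling it.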

t-writescode 1190 days ago [-]

Sure, but that’s almost entirely local within a process, where regular Akka processes would work.

So, in the system I worked with, individual applications were Akka powered and cross-process/cross-vm communication was done through queues

halfmatthalfcat 1189 days ago [-]

With Cluster + Sharding you can have zero(ish) downtime though when you scale horizontally. Messages sent to Sharded actors are buffered if their nodes ever go down and things will just resume as normal.
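A very simplified picture of what sharded routing with buffering looks like, in plain Python. This is not how Akka Cluster Sharding is implemented (it uses a shard coordinator and buffers during shard handoff), and `ShardRouter`, `num_shards`, and the modulo placement rule are assumptions made for the sketch. It only shows the two ideas from the comment: senders address an entity id, not a node, and messages are held while no node can take them.

```python
import zlib


class ShardRouter:
    """Maps entity ids to shards, and shards to whichever nodes are alive."""

    def __init__(self, nodes, num_shards=16):
        self.nodes = list(nodes)
        self.num_shards = num_shards
        self.buffered = []  # messages held while no node is available

    def shard_of(self, entity_id):
        # crc32 is stable across processes, unlike Python's built-in hash().
        return zlib.crc32(entity_id.encode("utf-8")) % self.num_shards

    def node_for(self, entity_id):
        if not self.nodes:
            return None
        return self.nodes[self.shard_of(entity_id) % len(self.nodes)]

    def send(self, entity_id, message):
        node = self.node_for(entity_id)
        if node is None:
            self.buffered.append((entity_id, message))
            return None
        return (node, entity_id, message)

    def node_down(self, node):
        self.nodes.remove(node)

    def node_up(self, node):
        self.nodes.append(node)
        # Flush anything buffered while the cluster had no nodes.
        pending, self.buffered = self.buffered, []
        return [self.send(entity_id, msg) for entity_id, msg in pending]
```

The caller only ever writes `router.send("chatroom-42", ...)`; which node ends up hosting that entity can change underneath it.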

t-writescode 1189 days ago [-]

You can have that with a queueing system, too. As long as you don't ack the message before you're done processing it, the messages that are processing will be sent to the next available client on crash
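The ack-after-processing behaviour described here can be sketched with an in-memory queue. This is a toy model in plain Python, not RabbitMQ's client API; `AckQueue`, `consumer_crashed`, and the integer delivery tags are hypothetical. A delivered message stays tracked as unacknowledged until the consumer confirms it, and goes back on the queue if that consumer dies first.

```python
from collections import deque


class AckQueue:
    """Delivers messages and redelivers any that were never acknowledged."""

    def __init__(self):
        self._ready = deque()
        self._unacked = {}   # delivery tag -> (consumer, message)
        self._next_tag = 0

    def publish(self, message):
        self._ready.append(message)

    def deliver(self, consumer):
        if not self._ready:
            return None
        message = self._ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self._unacked[tag] = (consumer, message)
        return tag, message

    def ack(self, tag):
        # Called only after processing finished successfully.
        del self._unacked[tag]

    def consumer_crashed(self, consumer):
        # Requeue everything the crashed consumer held but never acked,
        # newest first so appendleft restores the original order.
        lost = [tag for tag, (owner, _) in self._unacked.items() if owner == consumer]
        for tag in sorted(lost, reverse=True):
            _, message = self._unacked.pop(tag)
            self._ready.appendleft(message)
```

Note that this gives at-least-once delivery: a consumer that crashes after doing the work but before acking will cause the next consumer to process the same message again.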

halfmatthalfcat 1189 days ago [-]

Sure but the point is it's built into Akka. If you're looking to replace an external queue system, Akka will replace most if not all of its functionality.

tunesmith 1189 days ago [-]

What if you have a message that is specifically for another actor that may or may not be on the same server?

t-writescode 1189 days ago [-]

That doesn't exist in the models I've designed. Microservices still exist and they each do their requisite task, but different work is done by different actors in the actor cluster. Actors are still a fantastic way of handling concurrency, since they can each be treated as single-threaded tiny programs that only think about themselves, and so I've used them in that way.

The actor just won't exist elsewhere, but another microservice that happens to use actors might exist. It would be sent a web request or a message in queue or similar.

d3ntb3ev1l 1190 days ago [-]

Wait, what, are you saying Akka Clusters “works”
