Stateless Isn’t Always the Answer. Building Stateful Actor Services on AWS.
Building stateless and stateful services can be challenging. The actor model and actor frameworks offer a way to implement reliable, highly scalable stateful services on AWS.
- When the service must support a large number of devices and process a high volume of events, it needs to scale out to many instances behind a load balancer that distributes requests across them. How will the load balancer know which instance holds the data for a given device, so it can send that device's requests there?
- If all instances are designed to have a copy of the overall state, how do we synchronize the changes from one instance to all other instances and keep the state data consistent?
- When an instance crashes, how do we ensure the device data on that instance won’t be lost?
- When our system launches new instances, either to replace crashed ones or as part of a blue-green deployment, how will the new instances know the current status of devices so they can start processing requests?
Any data that needs to persist must be stored in a stateful backing service, typically a database.
counters to store the value, and create the service as a simple Lambda function:
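import boto3

# DynamoDB table that holds one counter item per device.
dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('counters')

def lambda_handler(event, context):
    # Load the device's current counter (assumes the item already exists).
    response = table.get_item(Key={'id': event['deviceId']})
    item = response['Item']
    # Increment the counter and write the whole item back.
    item['current_count'] = item['current_count'] + 1
    table.put_item(Item=item)
    # DynamoDB numbers come back as Decimal; convert so the result is JSON-serializable.
    return {'id': item['id'], 'current_count': int(item['current_count'])}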
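One thing worth noting about the handler above: it reads the item, increments it, and writes it back, so two concurrent invocations for the same device could overwrite each other's update. As a hedged alternative, the sketch below uses DynamoDB's update_item with an ADD expression so the increment happens atomically on the server side. The helper name increment_counter is illustrative, and the table argument is assumed to be the boto3 Table resource for the counters table.

```python
def increment_counter(table, device_id):
    """Atomically add 1 to a device's counter and return the new value.

    `table` is assumed to be a boto3 DynamoDB Table resource
    (for example boto3.resource('dynamodb').Table('counters')).
    """
    response = table.update_item(
        Key={'id': device_id},
        # ADD starts from 0 if the attribute does not exist yet.
        UpdateExpression='ADD current_count :inc',
        ExpressionAttributeValues={':inc': 1},
        # Ask DynamoDB to return the updated attribute.
        ReturnValues='UPDATED_NEW',
    )
    # Numbers are returned as Decimal; convert for JSON-friendly output.
    return int(response['Attributes']['current_count'])
```

A Lambda handler could call this helper with event['deviceId'] and return the resulting count. The request still makes a round trip to the database on every event, which is the cost the stateless design accepts.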
- Messages and Mailbox: Actors communicate with each other only by sending and receiving messages. Messages are sent asynchronously, so they are stored in the recipient's mailbox until the actor is ready to process them. Actors process messages one at a time, in the order they arrive.
- Cluster: A cluster is a group of nodes that work together to host actors. In a microservice environment, each node is an instance of a microservice that hosts the actor runtime environment and manages the lifecycle of actors. Actors are distributed across the nodes, often based on the load and resource availability of each node.
- Remote and Addressing: Actors may live on different nodes and machines in a cluster. An actor system must provide a remote protocol to pass the messages between actors on different nodes while ensuring the reliability and security of the message. Actors identify each other and the recipient of messages through addressing. Each actor has a unique address that consists of its name and location. An actor system will recognize the address of the recipient of each message and route it to the right node and mailbox.
- Persistence: In many cases, actors need to save their state to persistent storage. Such actors are called ‘persisted actors’. When a persisted actor crashes for any reason, it can be restarted and resume its state from the last saved snapshot.
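To make the mailbox and addressing concepts above more concrete, here is a minimal, framework-free sketch in Python using asyncio. It is not how any particular actor framework is implemented; CounterActor, ActorSystem, and the "increment" message are hypothetical names chosen for illustration. Each actor owns a private counter and a FIFO mailbox, and the system routes messages to a mailbox by address.

```python
import asyncio


class CounterActor:
    """Hypothetical actor: private state plus a FIFO mailbox."""

    def __init__(self, address):
        self.address = address
        self.count = 0                  # private state, never shared
        self.mailbox = asyncio.Queue()  # messages wait here for their turn

    async def run(self):
        # Process one message at a time, in arrival order.
        while True:
            message = await self.mailbox.get()
            if message is None:         # sentinel: stop the actor
                break
            if message == "increment":
                self.count += 1


class ActorSystem:
    """Hypothetical registry that routes messages by actor address."""

    def __init__(self):
        self.actors = {}
        self.tasks = []

    def spawn(self, address):
        actor = CounterActor(address)
        self.actors[address] = actor
        self.tasks.append(asyncio.create_task(actor.run()))
        return actor

    def send(self, address, message):
        # Asynchronous send: enqueue and return without waiting for processing.
        self.actors[address].mailbox.put_nowait(message)

    async def shutdown(self):
        # Ask every actor to stop after draining its mailbox.
        for actor in self.actors.values():
            actor.mailbox.put_nowait(None)
        await asyncio.gather(*self.tasks)


async def demo():
    system = ActorSystem()
    first = system.spawn("device-1")
    second = system.spawn("device-2")
    for _ in range(1000):
        system.send("device-1", "increment")
    for _ in range(5):
        system.send("device-2", "increment")
    await system.shutdown()
    return {first.address: first.count, second.address: second.count}


def run_demo():
    return asyncio.run(demo())
```

Because only the actor's own loop ever touches its counter, no lock is needed even when many senders enqueue messages for the same address.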
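syntax = "proto3";

// Event sent by an IoT device; the name matches the type handled in DeviceActor.
message IoTDeviceEvent {
  string DeviceId = 1;
}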
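using System.Threading.Tasks;
using Proto;

internal class DeviceActor : IActor
{
    // Actor state: number of events received from this device.
    private int _count;

    public Task ReceiveAsync(IContext context)
    {
        var msg = context.Message;
        switch (msg)
        {
            // Each device event increments the in-memory counter.
            case IoTDeviceEvent evt:
                _count++;
                break;
            default:
                break;
        }
        return Task.CompletedTask;
    }
}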
_count) in memory. There is no concurrency concern and no need to lock the _count state data. The code is as simple as the single-threaded program we had in the old days.
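// Create the actor system that hosts our actors.
var system = new ActorSystem();
// Props describe how a DeviceActor is created and spawned.
var props = new Props()
    .WithProducer(() => new DeviceActor())
    .WithSpawner(Props.DefaultSpawner);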
// Discover cluster members through the Kubernetes API.
IClusterProvider k8sProvider =
    new KubernetesProvider(
        new Kubernetes(KubernetesClientConfiguration.InClusterConfig())
    );
// Remote transport: gRPC bound to this node's address.
var remoteConfig = GrpcNetRemoteConfig.BindTo(ip, port);
// Register the "device" kind so the cluster can spawn DeviceActor instances.
var clusterConfig = ClusterConfig
    .Setup(
        clusterName: _CLUSTER_NAME,
        clusterProvider: k8sProvider,
        identityLookup: new PartitionIdentityLookup())
    .WithClusterKind(
        kind: "device",
        prop: props);
// Start this node as a member of the cluster.
await system.WithRemote(remoteConfig)
    .WithCluster(clusterConfig)
    .Cluster()
    .StartMemberAsync();
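The cluster configuration above relies on an identity lookup to decide which member hosts a given actor. As a rough illustration of that idea only, the sketch below maps an actor identity to a member with rendezvous (highest-random-weight) hashing. This is a swapped-in, generic technique, not a description of how PartitionIdentityLookup works internally; owner_of and the node names are hypothetical.

```python
import hashlib


def _score(member, identity):
    # Deterministic pseudo-random weight for a (member, identity) pair.
    digest = hashlib.sha256(f"{member}/{identity}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def owner_of(identity, members):
    """Return the member with the highest score for this identity."""
    if not members:
        raise ValueError("cluster has no members")
    return max(members, key=lambda member: _score(member, identity))


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c"]
    for device_id in ("device-1", "device-2", "device-3"):
        print(device_id, "->", owner_of(device_id, nodes))
```

Every node that knows the member list computes the same owner for a device, so a message can be routed without a central directory. When a node leaves, only the identities it owned are reassigned; the rest stay where they are.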
Persistence module, such as Proto.Persistence. Using the persistence module, our actors can persist their state through event sourcing, snapshotting, or both.
_persistence.PersistEventAsync(<event>) for each state change. Both events and snapshots are persisted with the actor identifier and will only be retrieved by the actor with the same identifier during recovery.
var client = new AmazonDynamoDBClient();
// Set options. You can replace the table names.
var options = new DynamoDBProviderOptions("events", "snapshots");
// Optionally, check for and create the tables automatically.
// The 1s at the end are just the initial read/write capacities.
// If you don't need snapshots or events, don't create that table.
// If you skip these calls, you have to create the tables manually.
//await DynamoDBHelper.CheckCreateEventsTable(client, options, 1, 1);
//await DynamoDBHelper.CheckCreateSnapshotsTable(client, options, 1, 1);
// Initialize the provider and use it for your persistent actors.
var provider = new DynamoDBProvider(client, options);
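To illustrate how event sourcing and snapshotting work together during recovery, here is a small, self-contained Python sketch. It does not use Proto.Persistence or DynamoDB; InMemoryStore stands in for the events and snapshots tables, and PersistentCounter, SNAPSHOT_EVERY, and the "incremented" event are hypothetical names. The actor persists one event per state change, saves a snapshot every few events, and on restart loads the latest snapshot and replays only the events recorded after it.

```python
class InMemoryStore:
    """Hypothetical stand-in for an events table and a snapshots table."""

    def __init__(self):
        self.events = {}     # actor_id -> list of (index, event)
        self.snapshots = {}  # actor_id -> (index, state)

    def append_event(self, actor_id, index, event):
        self.events.setdefault(actor_id, []).append((index, event))

    def save_snapshot(self, actor_id, index, state):
        self.snapshots[actor_id] = (index, state)

    def load_snapshot(self, actor_id):
        return self.snapshots.get(actor_id)

    def load_events(self, actor_id, after_index):
        return [(i, e) for i, e in self.events.get(actor_id, []) if i > after_index]


class PersistentCounter:
    SNAPSHOT_EVERY = 10  # assumed snapshot interval, chosen for the example

    def __init__(self, actor_id, store):
        self.actor_id = actor_id
        self.store = store
        self.count = 0
        self.index = 0     # index of the last applied event
        self.replayed = 0  # number of events replayed during recovery
        self._recover()

    def _recover(self):
        # Start from the latest snapshot, if one exists.
        snapshot = self.store.load_snapshot(self.actor_id)
        if snapshot is not None:
            self.index, self.count = snapshot
        # Replay only the events recorded after that snapshot.
        for index, event in self.store.load_events(self.actor_id, self.index):
            self._apply(event)
            self.index = index
            self.replayed += 1

    def _apply(self, event):
        if event == "incremented":
            self.count += 1

    def handle_device_event(self):
        # Persist the event first, then apply it to the in-memory state.
        self.index += 1
        self.store.append_event(self.actor_id, self.index, "incremented")
        self._apply("incremented")
        if self.index % self.SNAPSHOT_EVERY == 0:
            self.store.save_snapshot(self.actor_id, self.index, self.count)
```

Everything in the store is keyed by the actor identifier, so a restarted actor only ever sees its own events and snapshots.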
- Actor model - Wikipedia: This article gives an overview of the history, fundamental concepts, applications, and programming languages of the actor model of concurrent computation. It also lists various programming languages that employ the Actor model, as well as libraries and frameworks that permit actor-style programming in languages that don't have actors built-in.
- Actors: A Model of Concurrent Computation in Distributed Systems: This thesis by Gul Agha provides both a syntactic definition and a denotational model of Hewitt's actor paradigm, explaining how the actor model addresses some central issues in distributed computing.
- How the Actor Model Meets the Needs of Modern, Distributed Systems: Akka is a popular framework based on the actor model and built on the principles outlined in the Reactive Manifesto. This guide explains how the actor model can overcome challenges in building modern, distributed systems.
Any opinions in this post are those of the individual author and may not reflect the opinions of AWS.