Replicated State Machines

Understand the heart of most fault tolerant distributed systems
🪦 End of Life. This note is no longer maintained, and is scheduled for deletion.

Replicated state machines are a generalized approach for building fault-tolerant services with distributed systems. If you are familiar with the Gang of Four Design Patterns book, it is worth noting that the State Machine pattern described in the book has nothing to do with the way you would go about implementing a Replicated State Machine.

A Replicated State Machine is a software component that:

  • Encodes its states internally
  • Accepts commands that address a single method
  • Deterministic methods process each command
  • Internally, the commands mutate state and/or produce events.

If we have multiple Replicated State Machines that strictly follow the above rules, we can be certain that if they all start in the same initial state and they all process the commands in the same order, then they will have the same output events and the same internal state. The challenge then becomes how to get different machines to agree on the order of commands all while receiving concurrent client requests and in spite of failures.

Additional features that may be found with Replicated State Machines:

  • Snapshots - this allows a state machine internal state to be captured at a moment in time
  • Set - this allows the state machines internal state to be set to a given value.

These two additional features are important for the runtime operational aspects of a Replicated State Machine. If you wanted to capture the state of a state machine without needing to replay every command it had ever received since the initial state, you could take a snapshot of the internal state, save that to some place, and truncate every input command up-to-and-including the snapshot command. Then, when you want the state machine to recover it's internal state, you would set the internal state to the value produced by the snapshot.

Sample State Machine

public class ReplicatedStateMachine
    private int currentValue = 0;
    private List<EventListener> eventListeners = new ArrayList<>();

    public void addListener(EventListener eventListener)

    public void add(AddCommand addCommand)
        currentValue += addCommand.value;

    public void multiply(MultiplyCommand multiplyCommand)
        currentValue *= multiplyCommand.value;

    public void set(SetCommand setCommand)
        currentValue = setCommand.value;

    public void snapshot(SnapshotCommand snapshotCommand)

    private void notifyListeners()
        final NewValueEvent newValueEvent = new NewValueEvent();
        newValueEvent.currentValue = currentValue;
        for (final EventListener eventListener : eventListeners)

For sake of simplicity, let's assume that the EventListener just logs the current state:

public class EventListener
    private final Logger logger = LoggerFactory.getLogger(EventListener.class);

    public void newValue(NewValueEvent event)
    {"Current Value = {}", event.currentValue);

If we then run following the commands:

final EventListener eventListener = new EventListener();
final SimpleStateMachine bsm = new SimpleStateMachine();

final AddCommand add = new AddCommand();
add.value = 7;

final MultiplyCommand multiply = new MultiplyCommand();
multiply.value = 6;

final SetCommand set = new SetCommand();
set.value = 5;


We can be certain for every scenario in which we run the set, followed by the multiply, followed by the add then the internal value will be 37. If we changed the order to set, then add, then multiply the internal value would be 72.


F. B. Schneider, “Implementing fault-tolerant services using the state machine approach: a tutorial,” ACM Comput. Surv., vol. 22, no. 4, pp. 299–319, Dec. 1990, doi: 10.1145/98163.98167

Change log

  • Added 12 December 2020
reading time
3 min read
last updated
review policy
Distributed Systems
--- Views

© 2009-2023 Shaun Laurens. All Rights Reserved.