I recently came across a new paper co-authored by John Ousterhout, one of the original authors of the Raft protocol. In it John, and his co-author, describe an approach which can double the throughput of some popular replicated distributed key-value stores.
On the surface it’s a rather simple idea — if the order of a set of operations doesn’t matter, then you can send them to the store in any order, the store will end up in the same state. A client can use this fact to send a set of commutative operations to the store, and to a second set of storage nodes, known as witnesses.
Critically the client can do all of this concurrently and also not wait for the store to replicate the data. Since one of the biggest factors determining the throughput of a replicated store is waiting for the replica to acknowledge reception of the data, not having to wait for the replica can mean much higher throughput for the client. This is exactly what the paper shows — under certain conditions.
The paper is quite readable, and is worth reviewing. The biggest question I have is how effective it will be in practice. If one’s workload has a significant fraction of non-commutative accesses, it won’t be as effective. And differentiating between commutative and non-commutative operations may not be trivial for a given application.