Conflict-free replicated data type
CRDT is a data type that can be replicated across different nodes and be modified concurrently and independently. All inconsistencies are guaranteed to resolve.
Two main classes of CRDTs are:
CmRDT (commutative), operations-based
Operations are transferred and re-applied on the other side.
require exactly once delivery
it’s hard to do GC
require operations to be executed individually on every node (even if they could be batched)
CvRDT (convergent), state-based
State is transferred and merged.
high overhead from shipping state
With the broad definition, both CmRDT and CvRDT are expressible through one another (not possible with pure op-based CRDT).
CvRDT -> CmRDT: imagine the state being a log of all operations applied — now you have a CmRDT.
CmRDT -> CvRDT: let each operation be “apply this state.” This effectively is a CvRDT.
a hybrid of CmRDT and CvRDT: store as state but send delta operations on syncing
There are also Pure Operation-Based Replicated Data Types, a stricter definition of operations-based CRDTs.
CmRDT prepare an effector from operation and current state, which allows using state as an effector. Pure op-based CRDT prohibit this and require effector to be created without access to the state.
Typical data structures:
2P-Set—two-phase set. Consists of two G-Sets: added and removed elements. Allows removing elements, but does not allow re-adding them.
C-Set—Counter Set. A count is stored for each element. Might be called PN-Set (because it stores a PN-Counter).
OR-Set—Observed Remove Set. An element can only be removed if remove operation is caused by add operation (i.e., happens before). I believe this requires vector clocks.
LWW-Register—last write wins. Timestamp (logical) + site id (to break ties) + value
LWW-Set—like LWW-Register, but for inclusion of an element. Two flavors are Add-Wins (AWSet) and Remove-Wins (RWSet), which determine what happens if an element is added and removed concurrently.
MV-Register—Multi-Value Register. In case of concurrent writes, all values are stored in a set.
PN-Counter—positive-negative counter. Store two G-Counters (for increment and decrement).
Yjs a js framework for CRDT editing of different data types.
CRDTs are a good fit for Local-First Software as suggested in Local-first software: You own your data, in spite of the cloud
CRDT: Conflict-free Replicated Data Types - Anton Zagorskii - Medium - Good overview of CRDTs
Treedoc: allows concurrently editing a buffer (basically, any sequence of non-modifiable atoms).
LSEQ is a Treedoc alternative.
Briquemont…2015—CRDTs that allow replicas to only hold parts of the data.