📝CRDT

Conflict-free replicated data type

CRDT is a data type that can be replicated across different nodes and be modified concurrently and independently. All inconsistencies are guaranteed to resolve.

Two main classes of CRDTs are:

  • CmRDT (commutative), operations-based
    • Operations are transferred and re-applied on the other side.
    • Limitations:
      • require exactly once delivery
      • it’s hard to do GC
      • require operations to be executed individually on every node (even if they could be batched)
    • CmRDT naturally allow inspecting the history
  • CvRDT (convergent), state-based
    • State is transferred and merged.
    • Limitations:
      • high overhead from shipping state
  • With the broad definition, both CmRDT and CvRDT are expressible through one another (not possible with pure op-based CRDT).
    • CvRDT -> CmRDT: imagine the state being a log of all operations applied — now you have a CmRDT.
    • CmRDT -> CvRDT: let each operation be “apply this state.” This effectively is a CvRDT.
  • Delta-based
    • a hybrid of CmRDT and CvRDT: store as state but send delta operations on syncing
  • There are also Pure Operation-Based Replicated Data Types, a stricter definition of operations-based CRDTs.
    • CmRDT prepare an effector from operation and current state, which allows using state as an effector. Pure op-based CRDT prohibit this and require effector to be created without access to the state.

Typical data structures:

  • G-Set—grow-only set.
  • 2P-Set—two-phase set. Consists of two G-Sets: added and removed elements. Allows removing elements, but does not allow re-adding them.
  • C-Set—Counter Set. A count is stored for each element. Might be called PN-Set (because it stores a PN-Counter).
  • OR-Set—Observed Remove Set. An element can only be removed if remove operation is caused by add operation (i.e., happens before). I believe this requires vector clocks.
  • LWW-Register—last write wins. Timestamp (logical) + site id (to break ties) + value
  • LWW-Set—like LWW-Register, but for inclusion of an element. Two flavors are Add-Wins (AWSet) and Remove-Wins (RWSet), which determine what happens if an element is added and removed concurrently.
  • MV-Register—Multi-Value Register. In case of concurrent writes, all values are stored in a set.
  • G-Counter—grow-only counter.
  • PN-Counter—positive-negative counter. Store two G-Counters (for increment and decrement).

Projects

See also

Resources

Backlinks