đź“ťCRDT
Conflict-free replicated data type
CRDT is a data type that can be replicated across different nodes and be modified concurrently and independently. All inconsistencies are guaranteed to resolve.
Two main classes of CRDTs are:
CmRDT (commutative), operations-based
Operations are transferred and re-applied on the other side.
Limitations:
require exactly once delivery
it’s hard to do GC
require operations to be executed individually on every node (even if they could be batched)
CvRDT (convergent), state-based
State is transferred and merged.
Limitations:
high overhead from shipping state
With the broad definition, both CmRDT and CvRDT are expressible through one another (not possible with pure op-based CRDT).
CvRDT -> CmRDT: imagine the state being a log of all operations applied — now you have a CmRDT.
CmRDT -> CvRDT: let each operation be “apply this state.” This effectively is a CvRDT.
Delta-based
a hybrid of CmRDT and CvRDT: store as state but send delta operations on syncing
There are also Pure Operation-Based Replicated Data Types, a stricter definition of operations-based CRDTs.
CmRDT prepare an effector from operation and current state, which allows using state as an effector. Pure op-based CRDT prohibit this and require effector to be created without access to the state.
Typical data structures:
G-Set—grow-only set.
2P-Set—two-phase set. Consists of two G-Sets: added and removed elements. Allows removing elements, but does not allow re-adding them.
C-Set—Counter Set. A count is stored for each element. Might be called PN-Set (because it stores a PN-Counter).
OR-Set—Observed Remove Set. An element can only be removed if remove operation is caused by add operation (i.e., happens before). I believe this requires vector clocks.
LWW-Register—last write wins. Timestamp (logical) + site id (to break ties) + value
LWW-Set—like LWW-Register, but for inclusion of an element. Two flavors are Add-Wins (AWSet) and Remove-Wins (RWSet), which determine what happens if an element is added and removed concurrently.
MV-Register—Multi-Value Register. In case of concurrent writes, all values are stored in a set.
G-Counter—grow-only counter.
PN-Counter—positive-negative counter. Store two G-Counters (for increment and decrement).
Projects
Yjs a js framework for CRDT editing of different data types.
[gritzko/swarm: JavaScript replicated model](https://github.com/gritzko/swarm) and Replicated Object Notation — RON
[smoothers/cause — An EDN-like CRDT (Causal Tree) for Clojure & ClojureScript that automatically tracks history and resolves conflicts. | from Causal Tree]
See also
CRDTs are a good fit for Local-First Software as suggested in Local-first software: You own your data, in spite of the cloud
Use Distributed State Machines to handle centralized requests in CRDT
Resources
CRDT: Conflict-free Replicated Data Types - Anton Zagorskii - Medium - Good overview of CRDTs
alangibson/awesome-crdt: A collection of awesome CRDT resources
Treedoc: allows concurrently editing a buffer (basically, any sequence of non-modifiable atoms).
LSEQ is a Treedoc alternative.
See this comment in xi-editor for critique of CRDTs and how they are too much overhead for organizing plugin communication.
Briquemont…2015—CRDTs that allow replicas to only hold parts of the data.
Backlinks
- đź“ť CmRDT naturally allow inspecting the history
- đź“ť Figma
- đź“ť Causal Tree
- đź“ť Gun.js
- đź“ť Projectional Editor