📝CRDT
Conflict-free replicated data type
CRDT is a data type that can be replicated across different nodes and be modified concurrently and independently. All inconsistencies are guaranteed to resolve.
Two main classes of CRDTs are:
- CmRDT (commutative), operations-based
- Operations are transferred and re-applied on the other side.
- Limitations:
- require exactly once delivery
- it’s hard to do GC
- require operations to be executed individually on every node (even if they could be batched)
- CmRDT naturally allow inspecting the history
- CvRDT (convergent), state-based
- State is transferred and merged.
- Limitations:
- high overhead from shipping state
- With the broad definition, both CmRDT and CvRDT are expressible through one another (not possible with pure op-based CRDT).
- CvRDT -> CmRDT: imagine the state being a log of all operations applied — now you have a CmRDT.
- CmRDT -> CvRDT: let each operation be “apply this state.” This effectively is a CvRDT.
- Delta-based
- a hybrid of CmRDT and CvRDT: store as state but send delta operations on syncing
- There are also Pure Operation-Based Replicated Data Types, a stricter definition of operations-based CRDTs.
- CmRDT prepare an effector from operation and current state, which allows using state as an effector. Pure op-based CRDT prohibit this and require effector to be created without access to the state.
Typical data structures:
- G-Set—grow-only set.
- 2P-Set—two-phase set. Consists of two G-Sets: added and removed elements. Allows removing elements, but does not allow re-adding them.
- C-Set—Counter Set. A count is stored for each element. Might be called PN-Set (because it stores a PN-Counter).
- OR-Set—Observed Remove Set. An element can only be removed if remove operation is caused by add operation (i.e., happens before). I believe this requires vector clocks.
- LWW-Register—last write wins. Timestamp (logical) + site id (to break ties) + value
- LWW-Set—like LWW-Register, but for inclusion of an element. Two flavors are Add-Wins (AWSet) and Remove-Wins (RWSet), which determine what happens if an element is added and removed concurrently.
- MV-Register—Multi-Value Register. In case of concurrent writes, all values are stored in a set.
- G-Counter—grow-only counter.
- PN-Counter—positive-negative counter. Store two G-Counters (for increment and decrement).
Projects
- Yjs a js framework for CRDT editing of different data types.
- automerge/automerge: A JSON-like data structure
- Antidote DB
- [gritzko/swarm: JavaScript replicated model](https://github.com/gritzko/swarm) and Replicated Object Notation — RON
- Gun.js
- [smoothers/cause — An EDN-like CRDT (Causal Tree) for Clojure & ClojureScript that automatically tracks history and resolves conflicts. | from Causal Tree]
See also
- CRDTs are a good fit for Local-First Software as suggested in Local-first software: You own your data, in spite of the cloud
- Causal Tree
- Use Distributed State Machines to handle centralized requests in CRDT
Resources
- Conflict-free replicated data type - Wikipedia
- CRDT: Conflict-free Replicated Data Types - Anton Zagorskii - Medium - Good overview of CRDTs
- alangibson/awesome-crdt: A collection of awesome CRDT resources
- Treedoc: allows concurrently editing a buffer (basically, any sequence of non-modifiable atoms).
- LSEQ is a Treedoc alternative.
- Readings in conflict-free replicated data types
- Pure Operation-Based Replicated Data Types #paper
- Causal Tree
- See this comment in xi-editor for critique of CRDTs and how they are too much overhead for organizing plugin communication.
- Almeida…2016
- Briquemont…2015—CRDTs that allow replicas to only hold parts of the data.
- Almeida & Shapiro2025
Backlinks
- 📝 CmRDT naturally allow inspecting the history
- 📝 Figma
- 📝 Causal Tree
- 📝 Gun.js
- 📝 Projectional Editor