📖Approaches to conflict-free replicated data types

March 4, 2025. Last modified: March 5, 2025

authors: Almeida, Paulo Sérgio
year: 2024
url: https://doi.org/10.1145/3695249

51:4 The CAP theorem was proved by interpreting strong consistency as linearizability.
51:5 Causal consistency is the strongest consistency model achievable without losing availability.
However, causal consistency by itself does not ensure convergence.
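A tiny sketch of why causal consistency alone does not give convergence, assuming a plain register where the last delivered write wins (the `replay` helper and the replica names are made up for illustration). Two concurrent writes may be delivered in either order without violating causality, so two replicas can end up with different values:

```python
def replay(ops):
    """Apply register writes in delivery order; the last delivered write wins."""
    value = None
    for _origin, written in ops:
        value = written
    return value

# Two concurrent writes: neither causally precedes the other.
w1 = ("replica_a", 1)
w2 = ("replica_b", 2)

# Each replica applies its own write first, then the remote one.
# Both delivery orders respect causality, yet the final states differ.
state_a = replay([w1, w2])
state_b = replay([w2, w1])
```

A CRDT adds a deterministic rule for concurrent updates on top of this, so that both replicas reach the same state.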
51:5 Mechanical sympathy
Linearizability runs contrary to how our universe operates.
51:7 If two CRDT values are synchronized independently (i.e., they are two separate CRDTs), you can’t rely on accessing the state of both simultaneously (e.g., to perform a binary operation), as they can be out of sync with each other. You can only reliably access one CRDT at a time.
51:15
We also note that long partitions will be a problem in general for op-based (not only pure op-based) CRDTs, due to the need for buffering messages by the reliable messaging middleware. If long partitions or disconnected operation is the norm, then state-based CRDTs are preferable.
Causal stability: a message is causally stable at node i when all messages subsequently delivered at i will have a higher timestamp, i.e., when no more messages concurrent with it will be delivered.
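A minimal sketch of a causal stability check, assuming vector-clock timestamps and a known, fixed group of nodes. The names `leq`, `is_causally_stable` and `last_seen` are hypothetical, not taken from the paper. Node i keeps, for every node in the group, the latest clock it has delivered from that node; a timestamp is treated as stable once all of those clocks dominate it.

```python
from typing import Dict

VectorClock = Dict[str, int]

def leq(a: VectorClock, b: VectorClock) -> bool:
    """Pointwise comparison: a <= b in the vector-clock partial order."""
    return all(count <= b.get(node, 0) for node, count in a.items())

def is_causally_stable(ts: VectorClock, last_seen: Dict[str, VectorClock]) -> bool:
    """Assumption: last_seen has one entry per node in the group (including self).

    If every node's latest known clock dominates ts, every node has already
    delivered the message, so anything delivered later must causally follow it.
    """
    return all(leq(ts, clock) for clock in last_seen.values())

# Message sent by node "a" with timestamp {"a": 1}.
msg_ts = {"a": 1}

# Node "c" has not yet shown that it delivered the message.
last_seen = {"a": {"a": 1}, "b": {"a": 1, "b": 1}, "c": {"c": 1}}
before = is_causally_stable(msg_ts, last_seen)

# After a message from "c" whose clock includes a's message.
last_seen["c"] = {"a": 1, "c": 2}
after = is_causally_stable(msg_ts, last_seen)
```

Note that the check needs an entry for every node, which is one reason this style of reasoning fits a known set of replicas.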
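For state-based CRDTs, mutators (like the merge/join operation) must be inflations, i.e., the resulting state is never below the input state in the lattice order, but mutators need not be monotonic functions. “What is monotonic is state evolution over time, as a result of applying mutators or join.”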
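The difference between an inflation and a monotonic function can be checked on a small example. This is only a sketch over the lattice of sets ordered by inclusion, with a made-up mutator `add_size` that is not from the paper:

```python
def add_size(state: frozenset) -> frozenset:
    """Hypothetical mutator: insert the current cardinality as a new element."""
    return state | {len(state)}

s = frozenset({5})
t = frozenset({5, 0})

# s is below t in the lattice order (set inclusion).
s_below_t = s <= t

# Inflation: the output is never below the input.
is_inflation = s <= add_size(s) and t <= add_size(t)

# Monotonicity would require add_size(s) <= add_size(t); here it fails,
# because add_size(s) == {5, 1} and add_size(t) == {5, 0, 2}.
is_monotonic_here = add_size(s) <= add_size(t)
```

Each replica's state still only grows over time, since every step (mutator or join) is an inflation.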
26. Op-based CRDTs usually assume a known set of replicas (dynamic joining can be implemented but requires some care). State-based CRDTs allow trivial dynamic joining and leaving of replicas (the only requirement for named CRDTs is globally unique replica identifiers).
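To illustrate why joining is trivial in the state-based case, here is a sketch of a grow-only counter stored as a map from replica id to count. The function names and the `dc-1`/`dc-2` ids are illustrative assumptions; a new replica only needs a globally unique id (a random UUID here) and can start from the empty state.

```python
import uuid
from typing import Dict

GCounter = Dict[str, int]

def increment(state: GCounter, replica_id: str, amount: int = 1) -> GCounter:
    """Mutator: bump this replica's own entry (an inflation)."""
    new_state = dict(state)
    new_state[replica_id] = new_state.get(replica_id, 0) + amount
    return new_state

def join(a: GCounter, b: GCounter) -> GCounter:
    """Merge: pointwise maximum over the union of known replica ids."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}

def value(state: GCounter) -> int:
    return sum(state.values())

# Two established replicas.
a = increment({}, "dc-1")
b = increment({}, "dc-2", 2)

# A new replica appears without any membership protocol.
new_id = str(uuid.uuid4())
c = increment({}, new_id)

merged = join(join(a, b), c)
total = value(merged)
```

No existing replica had to know about the newcomer in advance; its entry simply shows up in the map on the next merge.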
Re partition tolerance:
- Because op-based CRDTs require causal delivery, operations that cannot be delivered yet tend to be kept in memory, so these CRDTs may suffer during long network partitions. “State-based CRDTs are thus more suitable for autonomous operation over unstructured networks with poor connectivity.”
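The buffering cost can be seen in a sketch of the receiving side of causal delivery. It assumes vector-clock tagged messages; `CausalBuffer` and its fields are hypothetical names, not an API from the paper or any library. A message is held in `pending` until everything it depends on has been delivered, so a long partition makes this buffer grow.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Message = Tuple[str, Dict[str, int], str]  # (sender, vector clock, payload)

@dataclass
class CausalBuffer:
    delivered: Dict[str, int] = field(default_factory=dict)  # messages delivered per sender
    pending: List[Message] = field(default_factory=list)     # received but not yet deliverable
    log: List[str] = field(default_factory=list)             # payloads in delivery order

    def _deliverable(self, sender: str, clock: Dict[str, int]) -> bool:
        # Must be the next message from this sender...
        if clock.get(sender, 0) != self.delivered.get(sender, 0) + 1:
            return False
        # ...and all of its dependencies on other nodes must already be delivered.
        return all(count <= self.delivered.get(node, 0)
                   for node, count in clock.items() if node != sender)

    def receive(self, sender: str, clock: Dict[str, int], payload: str) -> None:
        self.pending.append((sender, clock, payload))
        progress = True
        while progress:  # keep draining while something becomes deliverable
            progress = False
            for msg in list(self.pending):
                s, c, p = msg
                if self._deliverable(s, c):
                    self.pending.remove(msg)
                    self.delivered[s] = c[s]
                    self.log.append(p)
                    progress = True

buf = CausalBuffer()
buf.receive("b", {"a": 1, "b": 1}, "reply")   # depends on a's first message
held = len(buf.pending)                        # buffered, not delivered
buf.receive("a", {"a": 1}, "original")         # unblocks the reply
```

A state-based CRDT avoids this buffer entirely, because a single merged state subsumes any number of missed updates.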
28. One approach: use a small number of replicas at tier 0 (e.g., datacenters) as a permanent group. All other replicas use temporary ids that are later incorporated into the main structure (see handoff counters [3]).
- i.e., distinguish permanent from transient replicas
- another approach is renaming ids from client to server identifiers (still under research [58])
