【Distributed System】Handle Conflicts - Version Vector/ 向量时钟(Vector Clock)

Posted by 西维蜀黍 on 2025-02-02, Last Modified on 2025-04-09

Version Vector/ 向量时钟(Vector Clock)

A version vector is a data structure used to track changes in distributed systems, particularly for conflict resolution in eventual consistency models.

🛠️ Used in:

  • Distributed databases (e.g., DynamoDB, Riak, CRDTs)
  • File synchronization (e.g., Dropbox, Git)
  • Conflict resolution in replicated systems
  • Event ordering & causality tracking

How Version Vectors Work

Each node maintains a list (vector) of counters, one for each node that can modify the data. These counters reflect how many updates each node has made.

Let’s say we have 3 nodes: A, B, and C.

Each node’s version vector might look like this:

Node Version Vector
A A:2, B:1, C:0
B A:1, B:2, C:0

These mean:

  • Node A knows it made 2 changes, B made 1, and C made none.
  • Node B sees itself as having made 2 changes, A made 1, and C made none.

Version A dominates Version B if:

  • For all nodes, A’s vector has equal or greater counters than B’s, and for at least one, A’s counter is greater.
  • If neither dominates the other (some counters higher on each side), the versions are concurrent → meaning they were modified independently and may conflict.

Example - Group Chat Notes

Scenario:

You and two friends (A, B, and C) are taking shared notes for a class. Each of you works on your own copy, but you record how many edits each person has made.

Analogy:

  • A writes 2 edits, so their notes say:“A:2, B:0, C:0”
  • B copies A’s notes and adds their own 1 edit:“A:2, B:1, C:0”
  • C copies from B and adds 1 more:“A:2, B:1, C:1”

Now you know who contributed what, and whose copy is most updated.

Conflict Example:

If A and C both make edits without seeing each other’s changes, you’ll get:

  • A: [A:3, B:0, C:0]
  • C: [A:2, B:0, C:1]

Neither version is newer in all counts. So they are concurrent → meaning you’ll need to merge the notes manually.

Reference