
Synchronization in Distributed Systems

The Synchronization Problem

Core Challenges

  • Lack of Shared Memory: Traditional synchronization methods (like semaphores) fail because they rely on shared memory, which doesn't exist in distributed systems.
  • Event Ordering: Without a global clock or shared memory, it is difficult to determine which event occurred first (relative ordering).
  • Goal: Processes must cooperate and synchronize using only message passing.

Physical Clocks & Time Measurement

How Computer Clocks Work

  • Hardware Timer: Computers do not have "clocks"; they have "timers." A quartz crystal oscillates at a specific frequency, decrementing a hardware counter. When the counter hits zero, it triggers an interrupt (clock tick) and reloads from a holding register.
  • Software Clock: The OS interrupt handler counts these ticks to maintain the current time.

Clock Skew

  • Because crystals are not perfect, they oscillate at slightly different frequencies.
  • Result: Over time, the software clocks of different machines will drift apart, even if they started at the exact same time. This divergence is called Clock Skew.

Global Time Standards

  • Atomic Clocks: Highly accurate time measurement available in labs (e.g., Paris).
  • TAI (Temps Atomique International): The average of atomic clocks worldwide.
  • UTC (Universal Coordinated Time): The standard civil time used globally.

Clock Synchronization Logic

The "Make" Compilation Problem

Discrepancies in time can break logic.

  • Scenario: You compile input.c on machine B, producing input.o with timestamp 2144 (according to B's clock). Shortly afterwards, you edit input.c on machine A.
  • The Error: Machine A's clock is slow and thinks it is only 2142, so it saves the modified source file with timestamp 2142. Make sees that input.o (2144) is "newer" than input.c (2142) and refuses to recompile, even though the source just changed.

Mathematical Model of Drift

  • Definitions:
    • \(t\): The actual "Real" time (UTC).
    • \(C(t)\): The time shown by the computer's internal clock at real time \(t\).
    • \(dC/dt\): The rate at which the computer clock advances relative to real time.
  • Perfect Clock: \(dC/dt = 1\), i.e. \(C_p(t)=t\) for all machines \(p\).


  • Drift Rate (\(\rho\)): The maximum rate a clock can speed up or slow down (specified by the manufacturer, typically \(10^{-5}\) for modern chips).
  • Formula: The clock drift is bounded by \(1-\rho \le dC/dt \le 1+\rho\).
  • Resynchronization Interval: To keep two clocks within a maximum difference of \(\delta\), they must be resynchronized every \(\delta / (2\rho)\) seconds (because they might drift in opposite directions).
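The drift bound makes the resynchronization interval a one-line calculation. A minimal sketch (the function name and example values are my own, not from the notes):

```python
def resync_interval(delta, rho):
    """Seconds between resynchronizations to keep two clocks within delta.

    Two clocks drifting in opposite directions diverge at up to
    2*rho seconds per second, so they must resync every delta/(2*rho).
    """
    return delta / (2 * rho)

# Example: keep clocks within 1 ms given the typical drift rate 1e-5:
print(resync_interval(1e-3, 1e-5))  # 50.0 seconds
```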

Synchronization Algorithms

Cristian's Algorithm

  • Setup: One machine acts as a Time Server (connected to a WWV receiver); all other machines act as clients that stay synchronized with it.
  • The Process:
    1. Sending (\(T_0\)): The Client machine sends a request to the server. The timestamp \(T_0\) is recorded by the Client's clock.
    2. Interrupt (\(I\)): The Server receives the request and processes it (Interrupt handling time \(I\)).
    3. Reply (\(C_{UTC}\)): The Server sends back its current time (\(C_{UTC}\)).
    4. Receiving (\(T_1\)): The Client receives the reply. The timestamp \(T_1\) is recorded by the Client's clock.
  • The Calculation:
    • Goal: The client wants to set its clock to the Server's time, but the message took time to travel.
    • Round Trip Time (RTT): \(RTT = T_1 - T_0\).
    • New Time: The Client sets its clock to \(C_{UTC} + RTT/2\). If the server's interrupt-handling time \(I\) is known, the better estimate is \(C_{UTC} + (RTT - I)/2\).
  • Key Assumption: The algorithm assumes the network delay is symmetric, meaning the request and the reply took the exact same amount of time to travel (\(T_{req} \approx T_{res}\)).
  • Gradual Adjustment: Changes are introduced gradually (by adding slightly more/less seconds per interrupt) rather than jumping time instantly.
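The four steps above can be sketched as follows; `request_server` is a hypothetical callable standing in for the network round trip to the time server, and the function name is my own:

```python
import time

def cristian_adjusted_time(request_server):
    """One round of Cristian's algorithm.

    request_server() returns the server's clock reading C_UTC;
    T0 and T1 are taken from the client's own clock.
    """
    t0 = time.monotonic()           # T0: request sent (client clock)
    server_time = request_server()  # server replies with C_UTC
    t1 = time.monotonic()           # T1: reply received (client clock)
    rtt = t1 - t0
    # Assume symmetric delay: the reply traveled for roughly RTT/2.
    return server_time + rtt / 2
```

In practice the client would not jump to this value but feed the difference into the gradual adjustment described above.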


Network Time Protocol (NTP)

  • Type: Symmetric / Peer-to-Peer.
  • Scenario: Used over the internet to achieve high accuracy between machines B and A.
  • Mechanism: Measures four timestamps across a request/reply pair:
    • \(T_1\): A sends request.
    • \(T_2\): B receives request.
    • \(T_3\): B sends reply.
    • \(T_4\): A receives reply.
  • Offset Calculation: It calculates the time difference (Offset) between A and B while cancelling out the network delay:
    • \(Offset = ((T_2 - T_1) + (T_3 - T_4)) / 2\).


  • Action: The system gradually adjusts the clock to minimize this offset rather than jumping the time instantly (the clock will not go backward).
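The offset formula, together with the standard NTP delay estimate, is easy to verify numerically. A sketch with made-up timestamps (not from the notes):

```python
def ntp_offset(t1, t2, t3, t4):
    """Estimated offset of B's clock relative to A.

    t1/t4 are read from A's clock, t2/t3 from B's clock; the
    symmetric-delay terms cancel in the average.
    """
    return ((t2 - t1) + (t3 - t4)) / 2

def ntp_delay(t1, t2, t3, t4):
    """Total network delay, excluding B's processing time (t3 - t2)."""
    return (t4 - t1) - (t3 - t2)

# Example: B is 5 s ahead, one-way delay 1 s, processing 0.5 s.
# Then t1=0, t2=6, t3=6.5, t4=2.5 (in each machine's own clock):
print(ntp_offset(0, 6, 6.5, 2.5))  # 5.0
print(ntp_delay(0, 6, 6.5, 2.5))   # 2.0
```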

Global Time Standards (UTC)

Universal Coordinated Time (UTC)

  • Definition: A global time standard corrected with "leap seconds" to account for the slowing of the earth's rotation (mean solar day getting longer).
  • Sources for Precise Time:
    • WWV (NIST): Shortwave radio station from Colorado. Accuracy: \(\pm 1\) msec.
    • GEOS: Earth satellites. Accuracy: \(0.5\) msec.
    • Telephone (NIST): Available via modem; cheaper but less accurate.

Network Time Protocol (NTP)

Architecture

  • Layered System: Uses a hierarchy of "Strata" based on UDP message passing.
    • Stratum 1: Servers directly connected to a precision time source (like an atomic clock).
    • Stratum 2: Servers that sync with Stratum 1, and so on. Accuracy decreases as the strata number increases due to network latency.
  • Failure Robustness: If a Stratum 1 server fails, the hierarchy reconfigures automatically; a backup can become the new primary.

Operational Modes

  • Multicast: One computer periodically broadcasts time to a LAN.
  • Procedure Call: Similar to Cristian's Algorithm (Client requests time \(\rightarrow\) Server replies).


The Berkeley Algorithm

Concept

  • Type: Active Time Server (Centralized).
  • Usage: Suitable when no machine has a WWV receiver/external source.
  • Mechanism:

    1. Polling: A central "Time Daemon" periodically asks every machine for its time.
    2. Averaging: The daemon computes the average time from all answers.
    3. Adjustment: It tells machines to advance or slow down their clocks to match the average.


  • Constraint: If a clock is fast, it is instructed to slow down until the others catch up; time is never set backwards to avoid breaking software logic.
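One Berkeley round reduces to averaging and handing back per-machine corrections. A minimal sketch (function name and the sample values, which mirror Tanenbaum's classic 3:00 / 2:50 / 3:25 figure, are assumptions):

```python
def berkeley_adjustments(daemon_time, reported_times):
    """One Berkeley round: average all clocks (including the daemon's)
    and return the adjustment each machine should apply.

    A real daemon would also compensate each report for message delay
    and apply positive/negative corrections gradually.
    """
    all_times = [daemon_time] + list(reported_times)
    average = sum(all_times) / len(all_times)
    return [average - t for t in all_times]

# Daemon at 3:00 (180 min past noon), machines at 2:50 and 3:25:
print(berkeley_adjustments(180, [170, 205]))  # [5.0, 15.0, -20.0]
```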

Decentralized Averaging Algorithm

Concept

  • Type: Decentralized / Peer-to-Peer.
  • Mechanism:
    • Time is divided into fixed resynchronization intervals (\(R\)).
    • At the start of an interval, every machine broadcasts its current time.
    • After collecting broadcasts for a set time (\(S\)), each machine runs a local algorithm to compute the new time (typically averaging all values).
  • Variations: To improve accuracy, algorithms may discard the \(m\) highest and \(m\) lowest values (outliers) before averaging.
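The outlier-discarding variation is a trimmed mean. A sketch (names and values are my own):

```python
def trimmed_average(times, m):
    """Discard the m highest and m lowest readings, then average the rest."""
    ordered = sorted(times)
    kept = ordered[m:len(ordered) - m] if m > 0 else ordered
    return sum(kept) / len(kept)

# One machine is far ahead (150) and one far behind (60); with m=1
# both outliers are dropped before averaging:
print(trimmed_average([100, 101, 102, 150, 60], 1))  # 101.0
```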

Logical Clocks (Lamport Timestamps)

Physical vs. Logical

  • Physical Clocks: Must not deviate from real-world time.
  • Logical Clocks: Internal consistency matters more than real time. Processes only need to agree on the order in which events occur, not the time of day.

The "Happens-Before" Relation (\(\rightarrow\))

Defines ordering without a physical clock:

  1. Same Process: If \(a\) and \(b\) are in the same process and \(a\) occurs before \(b\), then \(a \rightarrow b\).
  2. Message Passing: If \(a\) is sending a message and \(b\) is receiving it, then \(a \rightarrow b\).
  3. Transitivity: If \(a \rightarrow b\) and \(b \rightarrow c\), then \(a \rightarrow c\).
  4. Concurrency: If neither \(a \rightarrow b\) nor \(b \rightarrow a\) holds, the events are Concurrent (\(a || b\)): nothing connects them, e.g. they occur in processes that never exchange messages.


Lamport's Algorithm Rules

Every process \(P_i\) maintains a logical counter \(L_i\).

  1. Local Event: Before any event occurs, increment the counter: \(L_i := L_i + 1\).
  2. Sending: When sending message \(m\), attach the timestamp \(t = L_i\).
  3. Receiving: When receiving \((m, t)\), update the local clock to be greater than both the local time and the message time: \(L_j := \max(L_j, t) + 1\).
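The three rules can be sketched as a small class (class and method names are my own, not from the notes):

```python
class LamportClock:
    """Minimal Lamport logical clock."""

    def __init__(self):
        self.time = 0

    def local_event(self):
        self.time += 1             # Rule 1: increment before any event
        return self.time

    def send(self):
        return self.local_event()  # Rule 2: attach t = L_i to the message

    def receive(self, t):
        # Rule 3: jump past both the local time and the message time.
        self.time = max(self.time, t) + 1
        return self.time

a, b = LamportClock(), LamportClock()
t = a.send()           # a.time == 1; message carries t == 1
b.receive(t)           # b.time == max(0, 1) + 1 == 2
print(a.time, b.time)  # 1 2
```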


Total Ordering Solution

  • Problem: Two events on different machines might end up with the same Lamport timestamp (e.g., both are 40).
  • Solution: To differentiate them, append the Process ID as a decimal.
    • Example: If Process 1 and Process 2 both have an event at time 40, they become 40.1 and 40.2. Since \(40.1 < 40.2\), the tie is broken arbitrarily but consistently.

Total Ordering & Vector Clocks

Lamport Timestamps & Total Ordering

The Replicated Database Problem

  • Scenario: Two users update a replicated database simultaneously.

    • User A sends "Update 1" to Replica 1.
    • User B sends "Update 2" to Replica 2.


  • Conflict: If Replica 1 processes 1 -> 2 and Replica 2 processes 2 -> 1, the databases become inconsistent.

  • Requirement: Updates must be ordered the same way across all replicas (Total Ordering).

Total-Ordered Multicast Algorithm

Uses Lamport Clocks to ensure identical processing order.

  1. Sending: The sender adds its local logical timestamp to the message and broadcasts it to all nodes.
  2. Receiving: Nodes add the message to a local queue ordered by timestamp, update the local logical clock, and broadcast an Acknowledgment (Ack) to everyone.
  3. Delivery Rule: A message is only delivered to the application when:
    • It is at the head of the queue.
    • The node has received Acks for that message from all other nodes.
  4. Result: Since all queues are ordered by the same timestamps, every node processes transactions in the exact same sequence.

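The queue-plus-acks delivery rule can be sketched as follows. Messages are identified by `(timestamp, sender_id)`, which also encodes the 40.1 vs. 40.2 tie-break; all names here are assumptions:

```python
import heapq

class TOMulticastNode:
    """Delivery queue for totally-ordered multicast (sketch)."""

    def __init__(self, n_nodes):
        self.n = n_nodes
        self.queue = []  # min-heap ordered by (timestamp, sender_id)
        self.acks = {}   # (timestamp, sender_id) -> set of acking nodes

    def on_message(self, ts, sender):
        heapq.heappush(self.queue, (ts, sender))
        self.acks.setdefault((ts, sender), set())

    def on_ack(self, ts, sender, from_node):
        self.acks.setdefault((ts, sender), set()).add(from_node)

    def deliverable(self):
        """Pop messages from the head of the queue once acked by all nodes."""
        out = []
        while self.queue and len(self.acks[self.queue[0]]) == self.n:
            out.append(heapq.heappop(self.queue))
        return out

node = TOMulticastNode(2)
node.on_message(40, 1)    # message from process 1 at logical time 40
node.on_ack(40, 1, 0)
node.on_ack(40, 1, 1)     # now acked by all nodes
print(node.deliverable())  # [(40, 1)]
```

Because every node sorts by the same `(timestamp, sender_id)` keys and waits for the same acks, all nodes pop messages in the same order.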

Vector Clocks

Why Lamport Clocks Aren't Enough

Lamport clocks ensure total ordering but cannot distinguish whether events are causally related or concurrent (they lose the history). Vector Clocks solve this by capturing the entire causal history.

Structure & Properties

Each process \(P_i\) maintains a vector \(VC_i\) (an array of integers):

  • \(VC_i[i]\): The number of events that have occurred locally at \(P_i\) (its own logical clock).
  • \(VC_i[j]\): The number of events \(P_i\) knows have occurred at process \(P_j\).
  • Knowledge: If \(VC_i[j] = k\), then Process \(i\) knows that Process \(j\) has progressed at least to event \(k\).

Algorithm Steps

  1. Local Event: Before executing an event, \(P_i\) increments its own index: \(VC_i[i] \leftarrow VC_i[i] + 1\).
  2. Sending: When sending message \(m\), \(P_i\) attaches its current vector \(VC_i\) as the timestamp \(ts(m)\).
  3. Receiving: Upon receiving \(m\) from \(P_i\), \(P_j\) updates its vector by taking the maximum of every entry \(k\): \(VC_j[k] \leftarrow \max(VC_j[k], ts(m)[k])\). Now that the merge is done, the receipt of the message is considered a "local event." Therefore, the process must apply Rule #1 again and then deliver to the application.
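The three steps can be sketched as a small class for \(n\) processes (class and method names are my own):

```python
class VectorClock:
    """Minimal vector clock for one of n processes."""

    def __init__(self, pid, n):
        self.pid = pid
        self.vc = [0] * n

    def local_event(self):
        self.vc[self.pid] += 1   # Rule 1: increment own entry

    def send(self):
        self.local_event()
        return list(self.vc)     # Rule 2: attach a copy as ts(m)

    def receive(self, ts):
        # Rule 3: element-wise max, then count the receive itself
        # as a local event before delivering to the application.
        self.vc = [max(a, b) for a, b in zip(self.vc, ts)]
        self.local_event()

p0, p1 = VectorClock(0, 2), VectorClock(1, 2)
m = p0.send()          # p0.vc == [1, 0]; message carries [1, 0]
p1.receive(m)          # merge to [1, 0], then local event -> [1, 1]
print(p0.vc, p1.vc)    # [1, 0] [1, 1]
```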

Causal-Ordered Multicast

Uses Vector Clocks to ensure a message is not delivered until all messages that "caused" it have been delivered.

The "Buffer" Logic

When node \(P_j\) receives a message \(m\) from node \(P_i\) with vector timestamp \(ts(m)\), it buffers (waits) until two conditions are met:

  1. Sequence Check: \(ts(m)[i] = VC_j[i] + 1\).
    • Meaning: This is exactly the next message from \(P_i\). (We haven't missed any messages from the sender).
  2. Causality Check: \(ts(m)[k] \le VC_j[k]\) for all \(k \neq i\).
    • Meaning: \(P_j\) has already seen every update that \(P_i\) had seen when it sent this message: for every other node \(k\), the sender must not have seen more messages than the receiver has.
  3. Delivery: Once satisfied, the local clock is updated (\(VC_j[i] = VC_j[i] + 1\)) and the message is delivered.
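The two checks translate directly into a delivery predicate (function and parameter names are my own):

```python
def can_deliver(ts, vc, sender):
    """Causal-ordered multicast delivery test.

    A message with vector timestamp ts from process `sender` may be
    delivered at a node whose current vector clock is vc.
    """
    # Sequence check: this is exactly the next message from the sender.
    next_in_sequence = ts[sender] == vc[sender] + 1
    # Causality check: we have seen everything the sender had seen.
    no_missing_deps = all(ts[k] <= vc[k]
                          for k in range(len(vc)) if k != sender)
    return next_in_sequence and no_missing_deps

# P2 at (0,0,0) receives (1,1,0) from P1: buffer, because the message
# depends on an update from P0 that P2 has not seen yet.
print(can_deliver([1, 1, 0], [0, 0, 0], sender=1))  # False
# Once P2 has seen P0's update, the same message becomes deliverable:
print(can_deliver([1, 1, 0], [1, 0, 0], sender=1))  # True
```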

Example


  • Violation: If \(P_1\) sends a message requiring \(VC=(1,1,0)\) but \(P_2\) only has \((0,0,0)\), \(P_2\) must wait. It effectively says "This message depends on an update from \(P_0\) that I haven't seen yet".