Transport Layer¶

Transport-layer Services¶

Transport Layer Fundamentals¶

Logical Communication The Transport Layer provides logical communication between application processes running on different hosts. Ideally, it creates the illusion that the two processes are directly connected, even though they may be on opposite sides of the planet, connected by numerous routers and links.

Transport vs. Network Layer Distinction To understand the difference between these layers, the slides use a "Household Analogy".

Network Layer (The Postal Service): This layer handles logical communication between hosts (houses). It is responsible for moving the data from one computer to another, much like a postal service moves mail from one house mailbox to another.
Transport Layer (Ann and Bill): This layer handles logical communication between processes (kids). Just as Ann and Bill collect the mail arriving at their households and hand each letter to the specific child (process) it is addressed to, the Transport Layer "demultiplexes" data to the correct application socket.
Dependency: The Transport Layer relies on and enhances the services of the underlying Network Layer.

Operational Mechanics¶

Sender Actions

The transport layer at the sending host receives the message from the application layer. It breaks this message into smaller chunks called segments and adds a transport header (\(T_h\)) to each one. These segments are then passed down to the Network Layer (IP) for transmission.

Receiver Actions

At the receiving host, the transport layer accepts the segment from the Network Layer (IP). It checks the header values to ensure integrity and identify the destination. It then reassembles the message and demultiplexes it, passing the data up to the correct application process via its specific socket.

Principal Internet Protocols¶

Internet applications essentially have two choices for transport services, each with distinct trade-offs.

1. TCP (Transmission Control Protocol)

Reliability: Provides reliable, in-order delivery of data, ensuring the byte stream arrives exactly as sent.
Management: Includes sophisticated mechanisms for congestion control (keeping senders from swamping the network) and flow control (keeping the sender from overwhelming the receiver).
Setup: Requires a connection setup phase (handshake) before data can be transferred.

2. UDP (User Datagram Protocol)

Unreliability: Provides an unreliable, unordered delivery service.
Simplicity: It is described as a "no-frills" extension of the underlying "best-effort" IP service, adding very little overhead.

Service Limitations It is important to note that neither TCP nor UDP can provide guarantees for delay (latency) or bandwidth.

Multiplexing and Demultiplexing¶

To support multiple applications running on the same host simultaneously, the transport layer must handle data from multiple sockets and ensure incoming data reaches the correct process.

The Core Mechanisms

Multiplexing (at Sender): The transport layer gathers data chunks from different sockets, encapsulates each chunk with a transport header (which includes port numbers), and passes the resulting segments to the network layer.
Demultiplexing (at Receiver): When the receiving host gets a segment, the transport layer examines the header fields to identify the receiving socket and directs the segment to that specific socket.

How Demultiplexing Works

The host receives IP datagrams, each containing a Source IP address and a Destination IP address. Inside each datagram is a transport-layer segment, which contains a Source Port and a Destination Port. The host uses this combination of IP addresses and port numbers to direct the segment to the appropriate socket.

Connectionless Demultiplexing (UDP)¶

Socket Identification

In UDP, a socket is identified by its destination IP address and destination port number; the source IP address and source port play no role in demultiplexing. When a UDP socket is created, the application must specify the host-local port number (e.g., DatagramSocket(12534)).

When creating a datagram to send into a UDP socket, the sender must specify the destination IP address and destination port number.

Routing Logic

When a host receives a UDP segment, it checks the destination port number in the segment header and directs the data to the socket with that specific port.

The "Many-to-One" Behavior: If two UDP segments arrive with different Source IP addresses or different Source Port numbers but have the same Destination Port number, they will be directed to the same destination socket.
Implication: The receiving application process receives packets from all senders mixed together in that single socket. In other words, IP/UDP datagrams with the same destination port number but different source IP addresses and/or source port numbers are directed to the same socket at the receiving host.
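The many-to-one behavior can be observed on a single machine. The sketch below is a minimal illustration using Python's standard socket module (not the Java DatagramSocket mentioned above); the function name, the loopback address, and the use of port 0 to let the OS pick a free port are assumptions made for the example.

```python
import socket

def udp_many_to_one_demo():
    """Send from two client sockets to one server socket on loopback."""
    # Server socket bound to a host-local port (port 0 asks the OS for a free one).
    server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server.bind(("127.0.0.1", 0))
    server.settimeout(2.0)  # avoid blocking forever if a datagram goes missing
    dest = server.getsockname()  # (destination IP, destination port)

    # Two independent senders; the OS assigns each its own source port.
    client_a = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client_b = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # The destination address is supplied with every datagram.
        client_a.sendto(b"from A", dest)
        client_b.sendto(b"from B", dest)

        received = []
        for _ in range(2):
            payload, (src_ip, src_port) = server.recvfrom(2048)
            received.append((payload, src_port))
        return received
    finally:
        for s in (server, client_a, client_b):
            s.close()

received = udp_many_to_one_demo()
for payload, src_port in received:
    print(payload, "arrived from source port", src_port)
```

Both datagrams come out of the same server socket even though their source ports differ; the application has to look at the sender address returned by recvfrom if it wants to tell the senders apart.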

Note that if the Destination IP address in the IP datagram does not match the IP address of the host, the host's Network Layer (IP) will typically discard the packet before it ever reaches the Transport Layer (UDP/TCP).

Connection-Oriented Demultiplexing (TCP)¶

Socket Identification (The 4-Tuple) Unlike UDP, a TCP socket is identified by a specific 4-tuple: (Source IP, Source Port, Destination IP, Destination Port). The receiver uses all four values to demultiplex the segment to the correct socket.

Routing Logic A TCP server (like a web server) may support many simultaneous TCP sockets.

Differentiation: Even if two segments are destined for the same IP address (e.g., the Web Server's IP) and the same Destination Port (e.g., Port 80), the system distinguishes them based on their Source IP and Source Port.
Dedicated Sockets: Each unique connection is assigned its own dedicated socket. For example, a segment from Host A and a segment from Host C will be demultiplexed into two completely different sockets on the server, ensuring their conversations remain private and separate.
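As a rough model of this lookup, the sketch below keeps one receive queue per 4-tuple. The class name TcpDemuxTable, its methods, and the example addresses (taken from the documentation address ranges) are hypothetical; a real TCP stack also handles listening sockets and connection setup, which are left out here.

```python
from typing import Dict, List, Tuple

# (source IP, source port, destination IP, destination port)
FourTuple = Tuple[str, int, str, int]

class TcpDemuxTable:
    """Toy demultiplexer: one dedicated queue ("socket") per connection."""

    def __init__(self) -> None:
        self.sockets: Dict[FourTuple, List[bytes]] = {}

    def open_connection(self, key: FourTuple) -> None:
        # Each established connection gets its own socket.
        self.sockets[key] = []

    def deliver(self, src_ip: str, src_port: int,
                dst_ip: str, dst_port: int, payload: bytes) -> bool:
        # All four values are used to find the socket.
        queue = self.sockets.get((src_ip, src_port, dst_ip, dst_port))
        if queue is None:
            return False  # no matching connection
        queue.append(payload)
        return True

table = TcpDemuxTable()
host_a = ("198.51.100.7", 9157, "192.0.2.10", 80)
host_c = ("203.0.113.5", 9157, "192.0.2.10", 80)  # same ports, different source IP
table.open_connection(host_a)
table.open_connection(host_c)
table.deliver(*host_a, b"GET /a")
table.deliver(*host_c, b"GET /c")
print(table.sockets[host_a], table.sockets[host_c])
```

The two segments share the destination IP and destination port 80, and even the same source port number, yet they land in different queues because the source IP differs.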

In many cases, (Source IP, Source Port, Dest Port) alone would already be enough to distinguish a conversation. TCP still includes the Destination IP as a formal part of the 4-tuple so that multi-homed hosts (hosts with more than one IP address) can tell apart connections addressed to their different local addresses.

Multiplexing and demultiplexing happen at all layers of the network stack, because each layer must support multiple "clients" from the layer above it.

Segment (The Letter): This is the unit of data at the Transport Layer (TCP/UDP). It contains the actual application data and a header (\(T_h\)) that includes port numbers to identify the correct process.
Datagram (The Envelope): This is the unit of data at the Network Layer (IP). The network layer takes the segment and "wraps" it inside an IP header that contains source and destination IP addresses to identify the correct host.

Connectionless Transport: UDP¶

UDP: User Datagram Protocol¶

UDP is the "no-frills," "bare-bones" transport protocol for the Internet. It is a connectionless service that follows a "send and hope for the best" philosophy.

Key Characteristics¶

Best Effort Service: UDP segments may be lost or delivered out-of-order to the application.
Connectionless: No handshaking occurs between the sender and receiver; each segment is handled independently.
Low Overhead: It has a small header size and no congestion control, allowing it to "blast away" data as fast as desired.
No Connection Delay: No connection establishment means no initial RTT (Round Trip Time) delay before data can be sent.

Common Uses¶

Streaming Multimedia: Used for loss-tolerant, rate-sensitive applications.
DNS (Domain Name System).
SNMP (Simple Network Management Protocol).
HTTP/3: Reliability and congestion control are added at the application layer rather than the transport layer.

UDP Segment Structure¶

The UDP header is exactly 8 bytes long, consisting of four 2-byte fields:

Source Port #: Identifies the sending process.
Dest Port #: Identifies the receiving process.
Length: The length (in bytes) of the entire UDP segment, including the header.
Checksum: Used to detect errors (flipped bits) in the transmitted segment.
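The four-field layout maps directly onto a fixed binary format. The sketch below packs and unpacks it with Python's struct module; the helper names build_udp_segment and parse_udp_segment are made up for illustration, and the checksum is simply left at 0 rather than computed.

```python
import struct

# Four unsigned 16-bit fields in network (big-endian) byte order.
UDP_HEADER = struct.Struct("!HHHH")

def build_udp_segment(src_port: int, dst_port: int,
                      payload: bytes, checksum: int = 0) -> bytes:
    # Length counts the 8-byte header plus the payload.
    length = UDP_HEADER.size + len(payload)
    return UDP_HEADER.pack(src_port, dst_port, length, checksum) + payload

def parse_udp_segment(segment: bytes) -> dict:
    src_port, dst_port, length, checksum = UDP_HEADER.unpack_from(segment)
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "payload": segment[UDP_HEADER.size:length],
    }

segment = build_udp_segment(12534, 53, b"hello")
fields = parse_udp_segment(segment)
print(UDP_HEADER.size, fields)
```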

UDP Checksum Calculation¶

The goal is to detect bit errors during transmission.

Sender Actions¶

Treat the segment contents (including the header fields and the IP addresses) as a sequence of 16-bit integers.
- The whole UDP segment is viewed as a series of 16-bit numbers, and the checksum is computed over all of them.
Calculate the checksum: Add the 16-bit integers using one's complement addition, then take the one's complement (flip every bit) of that sum.
- The "one" in one's complement refers to complementing with respect to all ones. For an n-bit number, the one's complement is the value that, when added to the original using one's complement addition, gives a string of n ones (e.g., for 16 bits: 1111 1111 1111 1111). In practice this is just a bitwise flip (0→1, 1→0), because flipping every bit is exactly what makes each bit pair add to 1. The name has nothing to do with "adding one".
Place the resulting value into the UDP checksum field.

While computing the checksum, the sender treats the checksum field itself as blanked out, i.e., as sixteen 0 bits.

Receiver Actions¶

Compute the checksum of the received segment (same way as above).
Compare the computed value with the value in the checksum field.
- Not Equal: Error detected.
- Equal: No error detected (though "weak protection" means some errors might still slip through).

Step-by-Step Example: Internet Checksum¶

This example demonstrates adding two 16-bit integers using one's complement addition.

Step 1: Add the two integers

  1 1 1 0 0 1 1 0 0 1 1 0 0 1 1 0
+ 1 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1
---------------------------------
(1) 1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1  <-- Notice the "carryout" (1)

Step 2: Handle the Wraparound In one's complement, a carryout from the most significant bit must be added back to the result.

    1 0 1 1 1 0 1 1 1 0 1 1 1 0 1 1
+                                 1  <-- Adding the carryout
-----------------------------------
Sum: 1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0

Step 3: Calculate the Checksum The final checksum is the 1's complement (flip all bits) of the sum.

Sum:      1 0 1 1 1 0 1 1 1 0 1 1 1 1 0 0
Checksum: 0 1 0 0 0 1 0 0 0 1 0 0 0 0 1 1
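The three steps can be checked mechanically. The following is a small sketch, with function names chosen for this example, that works on a list of 16-bit words and reproduces the numbers from the worked example.

```python
def ones_complement_sum(words):
    """Add 16-bit words, folding any carry out of bit 15 back into the sum."""
    total = 0
    for word in words:
        total += word
        # Wraparound: add the carryout (bit 16) back into the low 16 bits.
        total = (total & 0xFFFF) + (total >> 16)
    return total

def internet_checksum(words):
    """One's complement (bit flip) of the one's complement sum."""
    return ~ones_complement_sum(words) & 0xFFFF

words = [0b1110011001100110, 0b1101010101010101]
wrapped_sum = ones_complement_sum(words)
checksum = internet_checksum(words)
print(f"sum:      {wrapped_sum:016b}")
print(f"checksum: {checksum:016b}")
```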

While the receiver steps above describe recomputing and comparing the values, the reason for storing the one's complement (the inverted sum) is to simplify the verification logic:

Protocol Convention: The sender calculates the sum and then inverts it (flips the bits) before sending.
The Check: Because of this property of one's complement arithmetic, the receiver can simply add the checksum to the rest of the segment instead of recomputing and comparing.

When the receiver reads the packet, it treats the entire segment (header + payload, including the sender's checksum field) as a sequence of 16-bit integers and adds them all up using one's complement addition. Since the checksum is the bit-flipped sum of the rest of the segment, adding it to that same sum yields all 1s when nothing has changed. There is no extra step involved; it is plain addition over everything received.
- If the data is correct, the result will be all 1s (1111111111111111). Otherwise → error
- This "all 1s" check is extremely fast and easy for hardware to verify.
```
sum(data) = checksum + rest of segment = ~sum + sum = 1111 1111 1111 1111
```
Example

Step A: The Receiver calculates the Raw Sum of the data:
Result:    1 0 1 1 0 0 1 1 1 0 1 1 0 1 0 1   (This is "The Data")

Step B: The Receiver reads the Checksum sent by the Sender:
Checksum:  0 1 0 0 1 1 0 0 0 1 0 0 1 0 1 0   (This is the "Inverse")

Step C: The Receiver adds them together:
   1 0 1 1 0 0 1 1 1 0 1 1 0 1 0 1
+  0 1 0 0 1 1 0 0 0 1 0 0 1 0 1 0
----------------------------------
   1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1   <-- "All 1s"
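A receiver-side check along these lines might look as follows. This is a sketch with hypothetical function names; it takes the received 16-bit words, checksum field included, and tests whether their one's complement sum is all 1s.

```python
def ones_complement_sum(words):
    """Add 16-bit words, folding any carry out of bit 15 back into the sum."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)
    return total

def segment_looks_valid(words_including_checksum):
    """True when the sum over data plus checksum is sixteen 1 bits."""
    return ones_complement_sum(words_including_checksum) == 0xFFFF

raw_sum = 0b1011001110110101    # Step A: sum of the data
checksum = 0b0100110001001010   # Step B: checksum sent by the sender
ok = segment_looks_valid([raw_sum, checksum])

# Flipping a single bit of the data makes the check fail.
corrupted = segment_looks_valid([raw_sum ^ 0b1, checksum])
print(ok, corrupted)
```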

The "Weak Protection" Limitation¶

The checksum is not foolproof. Multiple bit flips can "cancel each other out," resulting in the same checksum value even though the data has changed.

Example of an undetected error:

Original:
Integer A: ... 1 0   (Ends in 1 0)
Integer B: ... 0 1   (Ends in 0 1)

Corrupted (Bit Flips):
Integer A: ... 0 1   (Ends in 0 1) <-- Flipped
Integer B: ... 1 0   (Ends in 1 0) <-- Flipped

Result:
The vertical sum is identical, so the Checksum remains valid.
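This can be confirmed numerically. In the sketch below, the two 16-bit values are picked for illustration so that one ends in 1 0 and the other in 0 1; the "corrupted" versions have those last two bits flipped in each word.

```python
def internet_checksum(words):
    """One's complement of the one's complement sum of 16-bit words."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF

original = [0b1110011001100110, 0b1101010101010101]   # end in 10 and 01
corrupted = [0b1110011001100101, 0b1101010101010110]  # end in 01 and 10

same_checksum = internet_checksum(original) == internet_checksum(corrupted)
print(original != corrupted, same_checksum)
```

The data differs in four bit positions, yet both lists produce the same checksum, so the receiver's check would pass.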

Principles of Reliable Data Transfer¶

Service Abstraction vs. Implementation¶

Reliable Service Goal: The objective is to provide an abstraction where the sending and receiving processes communicate through a "reliable channel," unaware of the underlying network's flaws.

Implementation Reality: The actual network is an unreliable channel that may corrupt, lose, or reorder data. The complexity of the RDT protocol depends strongly on which of these specific faults the channel exhibits.

However, the sender and receiver do not know each other's state (e.g., was a message received?) unless it is communicated via a message.

Protocol Interfaces (API)¶

The protocol interacts with the application and the network through four specific function calls:

rdt_send(): Called by the sending application to pass data down to the protocol layer.
udt_send(): Called by the RDT protocol to transfer the packet over the unreliable channel to the receiver.
rdt_rcv(): Called when a packet arrives from the channel at the receiving side.
deliver_data(): Called by the RDT protocol to deliver validated data up to the receiving application.

Finite State Machine (FSM) Strategy¶

Design Approach: The protocol is developed incrementally using Finite State Machines (FSM) to specify the behavior of the sender and receiver.

FSM Structure:
- States (Circles): Represent the system waiting for an event (e.g., "Waiting for data from app").
- Transitions (Arrows): Triggered by events (labels above the line) causing specific actions (labels below the line).
Directionality: We consider unidirectional data transfer (data flows only sender -> receiver), but control info (like ACKs) flows in both directions.

rdt1.0: Reliable Transfer over a Reliable Channel¶

Assumption: The underlying channel is perfectly reliable—no bit errors and no packet loss.
Simplicity: Because the channel never fails, no feedback (ACKs) is needed.
- Sender: Simply accepts data and sends it.
- Receiver: Simply reads data and delivers it.

From Above: Request from the App to send data.
From Below: A packet arriving from the network wire.

rdt2.0: Channel with Bit Errors¶

Assumption: The channel is unreliable and may flip bits (corrupt data), though packets are not lost.
Error Recovery Mechanisms: To fix bit errors, the protocol mimics human conversation:
- Checksum: A code added to the packet to detect if bits were flipped.
- Acknowledgements (ACKs): The receiver explicitly tells the sender the packet was received OK.
- Negative Acknowledgements (NAKs): The receiver explicitly tells the sender the packet had errors.
- Retransmission: If the sender receives a NAK, it re-sends the packet.

Sender FSM (Stop-and-Wait)¶

The sender operates in a Stop-and-Wait mode, meaning it cannot send new data until the current packet is confirmed.

The symbol \(\Lambda\) (uppercase Lambda) represents "No Action" or "Do Nothing."

State 1 (Wait for Data): When rdt_send(data) is called, the sender creates a packet (with checksum) and sends it. It then transitions to State 2.
State 2 (Wait for ACK/NAK):
- If NAK received: The sender re-transmits the packet and stays in State 2.
- If ACK received: The sender knows the data is safe, does nothing, and returns to State 1 to wait for new data.
Stop and Wait: sender sends one packet, then waits for receiver response
- Stop: After sending one single packet, the sender stops working on new data. It effectively "blocks" the application layer from sending anything else. It refuses to accept a new rdt_send() call.
- Wait: The sender enters a waiting state and sits idle until it receives a response (ACK or NAK) from the receiver regarding that specific packet.

FSM Specification and Operation¶

Sender FSM: The sender waits for a call from the application, builds a packet with a checksum, and sends it. It then enters a "wait state" where it listens for feedback. If it receives a NAK (indicating error), it re-transmits the packet. If it receives an ACK (indicating success), it returns to the initial state to await new data.
Receiver Unknown State: A key challenge is that the sender does not know the receiver's state (i.e., whether the message arrived correctly) unless explicitly told via the protocol.
Operational Traces:
- No Error Scenario: The sender transmits data -> receiver successfully validates it -> receiver sends ACK -> sender transitions to next state.
- Corrupted Packet Scenario: The sender transmits data -> receiver detects corruption -> receiver sends NAK -> sender receives NAK and re-transmits the packet.

The Fatal Flaw of rdt2.0¶

The Problem: The protocol assumes the feedback channel (ACK/NAK) is perfectly reliable, but ACKs and NAKs can also be corrupted.
Consequence: If an ACK/NAK is garbled, the sender does not know if the receiver got the data. It cannot simply retransmit because the receiver won't know if the new packet is fresh data or a duplicate of the previous one.
The Solution (Sequence Numbers): To handle duplicates, the sender adds a sequence number to each packet. The receiver can then check this number to discard duplicates.

rdt2.1: Handling Garbled Feedback¶

Design Updates:
- Sender: Adds a sequence number (0 or 1) to packets. Two numbers are sufficient because the protocol is "stop and wait" (one packet at a time). The sender must now check if received ACKs/NAKs are corrupt.
- Receiver: Must check if a received packet is a duplicate. The receiver's state indicates whether it is expecting sequence number 0 or 1.
rdt2.1 Sender FSM: The state machine doubles in size to handle sequence numbers. It now has four states: "Wait for call 0", "Wait for ACK/NAK 0", "Wait for call 1", and "Wait for ACK/NAK 1".

rdt2.1 Receiver FSM: The receiver also has distinct states for waiting for packet 0 and packet 1. If it receives a packet with the wrong sequence number (a duplicate), it sends an ACK but does not deliver the data.

Data Packet Checksum: Protects the payload (file/message) sent from Sender → Receiver.
ACK Packet Checksum: Protects the feedback (status) sent from Receiver → Sender. If the ACK or NAK packet arrives corrupted, the sender cannot tell which one it was, so it retransmits the current data packet.

rdt2.2: A NAK-Free Protocol¶

Concept: It is possible to achieve the same functionality as rdt2.1 without using NAKs. TCP uses this approach.
Mechanism:
- Instead of sending a NAK for a corrupted packet, the receiver sends an ACK for the last correctly received packet.
- The receiver must explicitly include the sequence number of the packet being ACKed.
Duplicate ACKs: If the sender receives a duplicate ACK (e.g., it sent packet 1 but receives another ACK for packet 0), it treats the duplicate as a NAK and retransmits the current packet.
FSM Changes: The Sender FSM checks the sequence number inside the ACK (isACK(rcvpkt, 1)). If the number is old, it re-sends.

In rdt2.2, the sender DOES treat a duplicate ACK effectively as a NAK.

rdt3.0: Channels with Errors and Loss¶

New Assumption: The underlying channel can now lose packets (both data and ACKs) in addition to corrupting them.
Mechanism (Timer): To handle loss, the sender waits a "reasonable" amount of time for an ACK. If no ACK is received by the deadline (timeout), the sender retransmits the packet.
- Handling Duplicates: If a packet (or ACK) is just delayed and not lost, the retransmission will create a duplicate. The existing sequence numbers (0 and 1) allow the receiver to detect and discard these duplicates.
- Receiver Responsibility: The receiver must explicitly specify the sequence number of the packet being ACKed.

Sender FSM¶

The sender FSM now incorporates a countdown timer to handle potential losses.

State Structure: It alternates between four states: "Wait for call 0", "Wait for ACK 0", "Wait for call 1", and "Wait for ACK 1".

Key Transitions:
- Sending: When sending a packet (udt_send), the sender triggers start_timer.
- Timeout: If the timer expires (timeout event), the sender executes udt_send(sndpkt) (retransmission) and restarts the timer (start_timer).
- Success: If the correct ACK is received (!corrupt && isACK), the sender stops the timer (stop_timer) and moves to the next state.
- Corruption/Bad ACK: If the sender receives a corrupted packet or the wrong ACK (e.g., ACK 1 while waiting for ACK 0), it does nothing (\(\Lambda\)) and keeps the timer running.

In rdt3.0, the sender ignores duplicate ACKs completely. It relies exclusively on the countdown timer (timeout) to trigger a retransmission.
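To see the timer and the alternating sequence numbers working together, here is a deliberately simplified simulation. It is a sketch under several assumptions: the function name and the drop_data / drop_ack loss schedules are invented for the example, bit corruption is not modeled, and a lost packet or lost ACK is represented directly as a "timeout" event instead of a real countdown timer.

```python
def rdt3_transfer(messages, drop_data=(), drop_ack=()):
    """Simulate stop-and-wait with loss.

    drop_data / drop_ack hold the 0-based indices of the data or ACK
    transmissions (counted over the whole run) that the channel loses.
    """
    delivered, log = [], []
    seq = 0        # sender: sequence number of the packet in flight
    expected = 0   # receiver: sequence number it is waiting for
    data_tx = ack_tx = 0  # transmission counters for the loss schedules

    for msg in messages:
        while True:
            log.append(f"send pkt{seq}")      # udt_send + start_timer
            data_lost = data_tx in drop_data
            data_tx += 1
            if data_lost:
                log.append("timeout")         # nothing arrived, no ACK comes back
                continue                      # retransmit the same packet

            if seq == expected:               # new data: deliver it
                delivered.append(msg)
                expected ^= 1
            else:                             # duplicate: discard, but still ACK
                log.append(f"receiver discards duplicate pkt{seq}")

            ack_lost = ack_tx in drop_ack
            ack_tx += 1
            if ack_lost:
                log.append("timeout")
                continue
            log.append(f"recv ack{seq}")      # correct ACK: stop_timer
            break
        seq ^= 1                              # alternate between 0 and 1
    return delivered, log

delivered, log = rdt3_transfer(["a", "b"], drop_data={1})
print(delivered)
print(log)
```

Calling rdt3_transfer(["a", "b"], drop_ack={1}) instead exercises the lost-ACK case, where the receiver sees pkt1 twice and discards the second copy.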

Operation Scenarios (Traces)¶

The protocol's behavior changes based on what gets lost or delayed.

(a) No Loss: Standard operation. Sender sends pkt0 \(\to\) Receiver gets pkt0/sends ack0 \(\to\) Sender gets ack0/sends pkt1.
(b) Packet Loss: Sender sends pkt1 \(\to\) pkt1 is lost \(\to\) Sender times out \(\to\) Sender resends pkt1 \(\to\) Receiver gets pkt1.

(c) ACK Loss: Receiver gets pkt1 and sends ack1 \(\to\) ack1 is lost \(\to\) Sender times out and resends pkt1 \(\to\) Receiver detects duplicate (already has pkt1), discards it, and re-sends ack1.
(d) Premature Timeout: Sender sends pkt1 \(\to\) timeout occurs before ack1 arrives (due to delay) \(\to\) Sender resends pkt1 \(\to\) Sender receives the original ack1 and moves on to pkt0 \(\to\) Receiver gets the retransmitted pkt1, detects the duplicate, and re-sends ack1 \(\to\) Sender receives this second ack1 (duplicate ACK) and ignores it.

Performance Analysis (Stop-and-Wait)¶

Utilization (\(U_{sender}\)): Defined as the fraction of time the sender is actually busy sending bits into the channel.
Calculation Example:
- Parameters: Link speed \(R = 1 \text{ Gbps}\), Propagation delay \(15 \text{ ms}\) (so \(RTT = 2 \times 15 \text{ ms} = 30 \text{ ms}\)), Packet size \(L = 8000 \text{ bits}\).
- Transmission Delay (\(D_{trans}\)): The time required to push all of the packet's bits into the link (channel).
  - This is different from propagation delay (the time it takes for the signal to travel across the wire). Transmission delay depends only on how fast the sender can get the data out of its own interface.
\[ D_{trans} = \frac{L}{R} = \frac{8000 \text{ bits}}{10^9 \text{ bits/sec}} = 8 \text{ microsecs} \]
- Utilization Calculation:
\[ U_{sender} = \frac{L/R}{RTT + L/R} = \frac{0.008 \text{ ms}}{30.008 \text{ ms}} \approx 0.00027 \]
Conclusion: The rdt3.0 protocol performance "stinks" because the sender spends most of its time waiting for the RTT (Round Trip Time). The protocol limits the capabilities of the underlying hardware.
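The same arithmetic in a few lines of Python, as a quick sanity check. The function name is an assumption for this sketch, and the effective throughput figure is derived here from the same parameters; it is not stated in the notes.

```python
def stop_and_wait_utilization(packet_bits, link_bps, rtt_seconds):
    """Fraction of time the sender spends pushing bits into the link."""
    d_trans = packet_bits / link_bps          # L / R
    return d_trans / (rtt_seconds + d_trans)  # (L/R) / (RTT + L/R)

L = 8000          # packet size in bits
R = 1e9           # link speed: 1 Gbps
RTT = 2 * 0.015   # two 15 ms propagation delays

u = stop_and_wait_utilization(L, R, RTT)
throughput_bps = L / (RTT + L / R)  # one packet per send/ACK cycle

print(f"utilization: {u:.5f}")
print(f"effective throughput: {throughput_bps / 1000:.0f} kbps")
```

Under these numbers the sender delivers only roughly 267 kbps over a 1 Gbps link.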

The Solution: Pipelining¶

Concept: To fix low utilization, the sender allows multiple "in-flight," yet-to-be-acknowledged packets.
Requirements:
- The range of sequence numbers must be increased (0 and 1 are no longer enough).
- Buffering is required at the sender and/or receiver.
Visual Difference: Stop-and-wait puts one packet on the path at a time; pipelining fills the pipe with many packets at once.

Performance Impact: Pipelining increases utilization significantly. For example, allowing 3 packets in-flight increases utilization by a factor of 3.
Utilization Formula:

\[ U_{sender} = \frac{3L/R}{RTT + L/R} \]

The denominator represents the total time of one "cycle" (from the moment you start sending the first packet until the moment you get the confirmation back).
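Generalizing the numerator from 3 to a window of n packets gives a quick way to see how utilization scales. This is a sketch: the function name is hypothetical, the link parameters are the ones from the stop-and-wait example, and the cap at 1.0 is an assumption added here, since the sender cannot be busy more than 100% of the time.

```python
def pipelined_utilization(window, packet_bits, link_bps, rtt_seconds):
    """Sender utilization with `window` packets in flight per cycle."""
    d_trans = packet_bits / link_bps
    # n * (L/R) / (RTT + L/R), capped at full utilization (assumption).
    return min(1.0, window * d_trans / (rtt_seconds + d_trans))

L, R, RTT = 8000, 1e9, 0.030

u1 = pipelined_utilization(1, L, R, RTT)
u3 = pipelined_utilization(3, L, R, RTT)
for n in (1, 3, 100, 1000):
    print(n, f"{pipelined_utilization(n, L, R, RTT):.5f}")
```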

Go-Back-N (GBN)¶

GBN is a specific implementation of Pipelining.

Sender Mechanism¶

Sliding Window: The sender is allowed to transmit up to \(N\) packets without waiting for an acknowledgment.
- Window Size (\(N\)): The maximum number of unacknowledged packets allowed in the pipeline.
- State Tracking: The sender tracks send_base (oldest unacked packet) and nextseqnum (next usable sequence number).
Cumulative ACK: An acknowledgment ACK(n) means "I have received all packets up to and including sequence number \(n\)".
- Action: Upon receiving ACK(n), the sender moves its window forward to begin at \(n+1\).
Timer Handling: The sender uses a single timer for the oldest in-flight packet. If this timer expires, the sender retransmits all unacknowledged packets (from send_base to nextseqnum - 1).
- timeout(n): When the timer expires for the oldest unacknowledged packet (n), the sender does not just re-send that one lost packet. Instead, it "goes back" to packet n and re-sends the entire window of packets that followed it.

Receiver Mechanism¶

ACK-Only Strategy: The receiver only sends ACKs for correctly received in-order packets.
- It keeps track of rcv_base, the sequence number of the next expected packet.
Handling Out-of-Order Packets: If the receiver gets a packet that is not the one it expects (e.g., waiting for 2 but gets 3):
- Discard: It throws the packet away rather than buffering it. (Buffering out-of-order packets is also possible; that is an implementation decision.)
- Re-ACK: It resends an ACK for the last correctly received in-order packet (e.g., sends ACK 1 again).

Go-Back-N in Action (Scenario Trace)¶

Normal Operation: Packets 0 and 1 are sent and ACKed successfully. The window slides forward.
Packet Loss Event:
1. Sender: Sends packets 2, 3, 4, 5.
2. Loss: Packet 2 is lost in transit.
3. Receiver: Receives packets 3, 4, and 5. Because they are out-of-order (it is waiting for 2), it discards them and sends duplicate ACKs for packet 1.
4. Sender Timeout: The timer for packet 2 expires.
5. Go-Back-N: The sender goes back to packet 2 and retransmits everything in the current window (packets 2, 3, 4, and 5).

The sender does not treat a duplicate ACK as a NAK. It simply ignores them. The sender relies exclusively on the timeout to trigger a retransmission.
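The scenario can be replayed with a small round-based simulation. It is a sketch under simplifying assumptions: the function name and the lose_once parameter are invented for the example, ACKs are never lost, time advances in rounds rather than with a real timer, and a timeout is modeled as "a round in which no ACK moved the window".

```python
def go_back_n(num_packets, window, lose_once=()):
    """Return (sent, delivered) sequence-number lists for a GBN transfer.

    lose_once: sequence numbers whose first transmission is lost.
    """
    to_lose = set(lose_once)
    send_base = 0    # oldest unacknowledged packet
    nextseqnum = 0   # next sequence number to send
    expected = 0     # receiver: next in-order sequence number (rcv_base)
    sent, delivered = [], []

    while send_base < num_packets:
        # Sender: fill the window.
        arrived = []
        while nextseqnum < min(send_base + window, num_packets):
            sent.append(nextseqnum)
            if nextseqnum in to_lose:
                to_lose.discard(nextseqnum)   # lost in transit, this time only
            else:
                arrived.append(nextseqnum)
            nextseqnum += 1

        # Receiver: deliver in-order packets, discard the rest,
        # and ACK the last in-order packet each time.
        acks = []
        for seq in arrived:
            if seq == expected:
                delivered.append(seq)
                expected += 1
            if expected > 0:
                acks.append(expected - 1)

        # Sender: cumulative ACKs slide the window; duplicates are ignored.
        progressed = False
        for ack in acks:
            if ack >= send_base:
                send_base = ack + 1
                progressed = True

        # Timeout: go back to send_base and resend everything outstanding.
        if not progressed:
            nextseqnum = send_base
    return sent, delivered

sent, delivered = go_back_n(6, window=4, lose_once={2})
print("sent:     ", sent)
print("delivered:", delivered)
```

With packet 2 lost once, the sent list shows packets 2 through 5 going out a second time after the timeout, while the receiver still delivers 0 through 5 exactly once and in order.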

| Protocol | Duplicate ACK Behavior | Trigger for Retransmission |
| --- | --- | --- |
| rdt 2.2 | Treated as NAK | Duplicate ACK |
| rdt 3.0 | Ignored (\(\Lambda\)) | Timeout Only |
| GBN | Ignored (\(\Lambda\)) | Timeout Only |

Questions¶

Why would the segment contents include IP addresses? Aren't those removed by the Network Layer before the segment is passed up? The question refers to this sender step:

Treat the segment contents (including header and IP addresses) as a sequence of 16-bit integers.