
Overview of Computer Networks

Architecture, Layers, and Protocols

The Communication Model (Client-Server)

Scenario:

A client (Computer A) wants to communicate with a server (Computer B) to access a service (e.g., Google or a Tomcat application).

image.png

  • Actors:
    • A (Client): The user's computer initiating the request.
    • B (Server): The machine hosting the application (e.g., Apache Web Server, Tomcat).
    • R1, R2, R3 (Routers): Network devices that direct traffic. In the direct path from A to B, R2 acts as the central gateway, while R1 and R3 represent connections to other parts of the internet.
  • Process:
    • A sends a Request.
    • R2 routes it to B.
    • B processes it and sends a Response back via R2.

Core Communication Issues & Solutions

Communication is not automatic; specific challenges must be solved at different stages.

| Issue | Solution (Layer) | Concept (Spoiler!) |
| --- | --- | --- |
| How to put data on the wire? | Physical (L1) | Converts digital bits into analog signals (voltage, light). |
| How to choose R2 vs L (neighbor)? | Data Link (L2) | Uses MAC Addresses to identify devices on the same physical link. |
| How to choose path to destination? | Network (L3) | Uses IP Addresses and routing tables to find the path across the internet. |
| How to distinguish Apps (Tomcat vs Apache)? | Transport (L4) | Uses Port Numbers to deliver data to the correct process. |
| What if data is lost? | Transport (L4) | Protocols like TCP handle reliability and retransmission. |

Network Architecture: Layered Protocols (Protocol Stack)

To manage complexity, network functionality is divided into layers. Each layer performs one specific function and communicates with the layers above/below via defined interfaces.

image.png

image.png

When sending data, you go down the stack (adding headers). When receiving data, you go up the stack (removing headers).

A. The Internet Protocol Stack (5 Layers)

This is the practical model used for the actual Internet.

  1. L5: Application Layer
    • Function: Supports network applications (FTP, SMTP, HTTP).
    • Data Unit: Message
  2. L4: Transport Layer
    • Function: Host-to-host data transfer (TCP, UDP). Handles reliability and port addressing.
    • Data Unit: Segment
  3. L3: Network Layer
    • Function: Routing from source to destination (IP, Routing Protocols).
    • Data Unit: Datagram
  4. L2: Data Link Layer
    • Function: Data transfer between neighboring elements (Ethernet, PPP, Wi-Fi). Handles MAC addressing.
    • Data Unit: Frame
  5. L1: Physical Layer
    • Function: Bits "on the wire."

image.png

B. The ISO OSI Model (7 Layers)

A theoretical model often used for reference. It adds two layers between Application and Transport:

  • Presentation (L6): Data representation and encryption (e.g., SSL/TLS).
  • Session (L5): Session management and automatic recovery.

Encapsulation & Message Format

As data moves down the stack, each layer wraps the previous data with its own information.

image.png

  • Headers: Added by L2, L3, L4, L5, etc. to the front of the message (L5 first, and L2 last).
  • Trailers: Uniquely added by the Data Link Layer (L2) to the end of the message (for error checking).
  • Structure:[ L2 Header | L3 Header | L4 Header | ... Application Data ... | L2 Trailer ]

Encapsulation (The "Creation" View)

This process happens Inside-Out. When your computer prepares to send data, it builds the packet like wrapping a gift in layers of paper.

  • Step 1 (Core): Start with the Application Data (The Request).
  • Step 2 (Layer 4): Add Transport Header (TCP/UDP).
  • Step 3 (Layer 3): Wrap that entire package inside an IP Header (Source/Dest IP).
  • Step 4 (Layer 2): Finally, wrap that entire IP packet inside an L2 Header (Source/Dest MAC).
  • Result: The L2 Header is the "outermost" layer.
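
The wrapping steps above can be sketched in a few lines of Python. This is a toy model, not a real protocol implementation: the header layouts are simplified assumptions (ports only at L4, addresses only at L3 and L2), the function name `encapsulate` is hypothetical, and the CRC-32 trailer is only a stand-in for a real frame check sequence.

```python
import socket
import struct
import zlib


def encapsulate(app_data, src_port, dst_port, src_ip, dst_ip, src_mac, dst_mac):
    """Wrap application data in toy L4, L3, and L2 headers (inside-out)."""
    # Step 2 (Layer 4): toy transport header carrying only the two ports.
    segment = struct.pack("!HH", src_port, dst_port) + app_data
    # Step 3 (Layer 3): toy IP header carrying only source and destination IP.
    datagram = socket.inet_aton(src_ip) + socket.inet_aton(dst_ip) + segment
    # Step 4 (Layer 2): toy header with destination MAC, then source MAC.
    frame = dst_mac + src_mac + datagram
    # L2 trailer: a CRC-32 over the frame, used here as a simple error check.
    return frame + struct.pack("!I", zlib.crc32(frame))


if __name__ == "__main__":
    frame = encapsulate(
        b"GET /", 50000, 8080, "10.0.0.1", "10.0.0.2",
        bytes.fromhex("aaaaaaaaaaaa"), bytes.fromhex("bbbbbbbbbbbb"),
    )
    print(frame.hex())
```

The L2 header ends up at the front of the byte string and the application data sits in the middle, which matches the "outermost layer" description.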

Transmission Order (The "Wire" View)

This process happens Outside-In. When the data is actually sent over the wire, the outer layer must go first so the receiver knows what to do immediately.

  • Visualizing the Diagram: In network diagrams like the one below, Left to Right usually represents Time (First bit sent \(\rightarrow\) Last bit sent).

image.png

  • The Order:
    1. L2 Header: Arrives first.
    2. IP Header: Arrives second.
    3. Data: Arrives last.

Decapsulation (The Receiver's View)

The receiving computer processes the data by "peeling the onion" layer by layer.

  • Check L2 First: The hardware (Network Card) reads the L2 Header first. If the Destination MAC doesn't match its own, it discards the frame immediately without even looking at the IP header.
  • Check IP Second: If the MAC matches, it strips the L2 Header and passes the rest to the OS. The OS reads the IP Header.
  • Process Data: If the IP matches, the OS strips the IP Header, uses the Destination Port in the Transport Header to pick the right process, and passes the Data to that Application.
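
The receiver's checks can be sketched as follows. The frame layout is a simplified assumption made for illustration (6-byte MACs, 4-byte IPs, 2-byte ports, no trailer), and `build_toy_frame` and `decapsulate` are hypothetical helper names. The destination port is returned alongside the data so the caller can see which process the data is for.

```python
import socket
import struct


def build_toy_frame(dst_mac, src_mac, src_ip, dst_ip, src_port, dst_port, data):
    """Build a simplified frame: [dst MAC | src MAC | src IP | dst IP | ports | data]."""
    return (dst_mac + src_mac
            + socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!HH", src_port, dst_port) + data)


def decapsulate(frame, my_mac, my_ip):
    """Peel the layers one by one; return (dst_port, data) or None if dropped."""
    # Check L2 first: a frame for another MAC is dropped without reading further.
    if frame[0:6] != my_mac:
        return None
    datagram = frame[12:]                      # strip the L2 header
    # Check IP second: the destination IP is bytes 4..8 of the toy IP header.
    if socket.inet_ntoa(datagram[4:8]) != my_ip:
        return None
    segment = datagram[8:]                     # strip the IP header
    _src_port, dst_port = struct.unpack("!HH", segment[:4])
    return dst_port, segment[4:]               # strip the transport header
```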

Layer 2: The Data Link Layer

Focus on Ethernet networks.

The L2 Header is like the shipping label on an envelope. The MAC addresses are the "To" and "From" addresses written on that label.

  • Function: Media Access Control (MAC), Error Detection, Node Identification.
  • Addressing (MAC Addresses): Used for "hop-by-hop" delivery.

    • Example: Sending from A to Router R2.
    • Source: MAC_A
    • Destination: MAC_R2
    • Note: When R2 forwards the packet to B, it changes the Source to MAC_R2 and Destination to MAC_B.

    image.png

  • "Rest of L2 Header": Refers to fields like EtherType (identifies the protocol inside, e.g., IPv4) and Preamble (synchronization bits).

Layer 3: The Routing Layer

Goal:

The primary purpose of Layer 3 is to move packets among routers from the original source to the final destination.

  • Datagram Network:
    • The network is connectionless; every packet (datagram) is treated independently.
    • Analogy: It is like driving a car and stopping to ask for directions at every intersection. The destination determines the next hop, but the route might change during the session.
    • Distributed Routing Protocols: Routers talk to each other to adapt to network conditions (traffic, failures) and update their maps efficiently.

IP Addressing

To move data, every computer needs a unique identifier called an IP Address.

  • Versions:
    • IPv4: 32-bit address (The focus of these contents).
    • IPv6: 128-bit address.
  • Packet Layout:

    • An IP Packet consists of a Header and Data.
    • The Header contains the IP Address of Source and IP Address of Destination.
    • The total packet size can be up to 65,535 bytes (just under 64 KB), because the header's Total Length field is 16 bits wide.

    image.png

IP Address Classes (IPv4 Structure)

IP addresses are divided into "Classes" to manage networks of different sizes. The address is split into two parts: Network ID (identifies the group) and Host ID (identifies the specific machine).

image.png

| Class | Starts With (Binary) | Format (Bits) | Decimal Range (1st Octet) | Intended Use |
| --- | --- | --- | --- | --- |
| Class A | 0... | 7 bit Network / 24 bit Host | 0 to 127 (1 to 126 assignable; 0 and 127 are reserved) | Very large networks (Few nets, many hosts). |
| Class B | 10... | 14 bit Network / 16 bit Host | 128 to 191 | Medium networks. |
| Class C | 110... | 21 bit Network / 8 bit Host | 192 to 223 | Small networks (Many nets, few hosts). |
| Class D | 1110... | Multicast Address | 224 to 239 | Multicasting (Sending to a group). |
| Class E | 1111... | Reserved | 240 to 255 | Experimental/Unused. |
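
The class of an address follows directly from the leading bits of its first octet. A minimal sketch of that rule (the function name `ipv4_class` is made up for this example):

```python
def ipv4_class(address: str) -> str:
    """Return the classful category (A-E) of a dotted-quad IPv4 address."""
    octets = [int(part) for part in address.split(".")]
    if len(octets) != 4 or any(not 0 <= o <= 255 for o in octets):
        raise ValueError(f"not a dotted-quad IPv4 address: {address!r}")
    first = octets[0]
    if first >> 7 == 0b0:        # 0xxxxxxx
        return "A"
    if first >> 6 == 0b10:       # 10xxxxxx
        return "B"
    if first >> 5 == 0b110:      # 110xxxxx
        return "C"
    if first >> 4 == 0b1110:     # 1110xxxx
        return "D"
    return "E"                   # 1111xxxx


if __name__ == "__main__":
    for addr in ("10.0.0.1", "172.16.0.1", "192.168.1.1", "224.0.0.1", "250.1.1.1"):
        print(addr, "-> Class", ipv4_class(addr))
```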

image.png

Class D: Multicast

  • Identifier: Starts with binary 1110.
  • Range: 224.0.0.0 to 239.255.255.255.
  • Function: This class is used for Multicasting.
    • With Unicast (one-to-one delivery, used by Class A, B, and C addresses), if you want to send a video stream to 50 people, you have to send 50 separate copies of the data.
    • In Multicast (Class D), you send one single stream to a specific "Multicast Address" (like a radio frequency). The routers in the network are smart enough to duplicate the packet only when necessary to reach the users who have "subscribed" to that group.
  • Structure: There is no division between "Network ID" and "Host ID." The entire 28 remaining bits identify the multicast group.

Class E: Reserved

  • Identifier: Starts with binary 1111.
  • Range: 240.0.0.0 to 255.255.255.255.
  • Function: This class is Reserved for future use or experimental research.
  • Usage: You effectively never see these addresses on the public internet. If you try to assign a Class E IP address to your computer, most operating systems will reject it as invalid because it is not meant for standard network traffic.
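
Python's standard `ipaddress` module already knows about the Class D and Class E ranges, so the two special classes can be checked without hand-written bit tests. In the sketch below, `describe` and `multicast_group_id` are hypothetical helper names; the 28-bit mask follows from the "28 remaining bits" noted above.

```python
import ipaddress


def describe(address: str) -> str:
    """Label an IPv4 address as multicast, reserved, or ordinary unicast range."""
    ip = ipaddress.IPv4Address(address)
    if ip.is_multicast:          # 224.0.0.0 to 239.255.255.255
        return "multicast (Class D)"
    if ip.is_reserved:           # 240.0.0.0 to 255.255.255.255
        return "reserved (Class E)"
    return "unicast (Class A, B, or C range)"


def multicast_group_id(address: str) -> int:
    """Return the 28 bits that follow the 1110 prefix of a Class D address."""
    ip = ipaddress.IPv4Address(address)
    if not ip.is_multicast:
        raise ValueError(f"{address} is not a multicast address")
    return int(ip) & 0x0FFFFFFF
```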

Routing Tables

Routers use internal lookup tables to decide where to send a packet next.

Routers use routing protocols to maintain these tables, and use the tables to route packets.

  • Routing Protocols: Routing protocols are the software algorithms used by routers to talk to each other and exchange information about the network's layout. Their main purpose is to build and maintain the Routing Tables.

image.png

  • How it works: The router looks at the Destination IP, checks its table for the best "Link" (path), and forwards the packet.
  • Cost: Tables often include a "Cost" metric (e.g., number of hops or latency) to find the most efficient path.
  • Example:
    • Router A wants to send to E.
    • A's table says: Go via Link 1, which leads to Router B.
    • Router B receives it, looks at its own table, and forwards to E via Link 4.
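
The example can be modelled with one small lookup table per router. This is only a sketch: the assumption that Link 1 connects A to B, the cost values, and the names `ROUTING_TABLES` and `forward` are all made up for illustration.

```python
# Per-router tables: destination -> (outgoing link, next hop, cost).
# Link-to-neighbor mapping and costs are illustrative assumptions.
ROUTING_TABLES = {
    "A": {"E": ("Link 1", "B", 2)},
    "B": {"E": ("Link 4", "E", 1)},
}


def forward(source: str, destination: str, max_hops: int = 16):
    """Follow the tables hop by hop and return the (router, link) pairs used."""
    path = []
    current = source
    while current != destination:
        if len(path) >= max_hops:
            raise RuntimeError("too many hops; possible routing loop")
        # Each router only consults its own table for the next step.
        link, next_hop, _cost = ROUTING_TABLES[current][destination]
        path.append((current, link))
        current = next_hop
    return path


if __name__ == "__main__":
    print(forward("A", "E"))
```

No router in this model knows the whole path; each one only picks the next link, which is the hop-by-hop behavior described above.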

Data Flow & Encapsulation (Layer 3 Focus)

When a packet moves through a router (like R2), the data moves up and down the protocol stack.

image.png

image.png

  1. Incoming (A to R2):
    • R2 receives the signal (Physical).
    • R2 reads the L2 Header (Source: MAC_A, Dest: MAC_R2) and strips it off.
    • R2 looks at the IP Header (Dest: IP_B) to decide where to go next.
  2. Outgoing (R2 to B):
    • R2 creates a new L2 Header.
    • New Source: MAC_R2.
    • New Destination: MAC_B (or the next router).
    • The IP Header (Source: IP_A, Dest: IP_B) remains unchanged.
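
A short sketch of what R2 does to the addresses, using symbolic strings such as "MAC_A" instead of real addresses. The `Frame` class and `route` function are hypothetical names for this example, and the next-hop MAC is passed in directly rather than being looked up.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    src_ip: str
    dst_ip: str
    payload: bytes


def route(frame: Frame, router_mac: str, next_hop_mac: str):
    """Forward a frame one hop: rewrite the MACs, leave the IPs untouched."""
    if frame.dst_mac != router_mac:
        return None  # not addressed to this router on the link
    # New L2 header for the outgoing link; the IP header is copied unchanged.
    return replace(frame, src_mac=router_mac, dst_mac=next_hop_mac)


if __name__ == "__main__":
    incoming = Frame("MAC_A", "MAC_R2", "IP_A", "IP_B", b"request")
    print(route(incoming, router_mac="MAC_R2", next_hop_mac="MAC_B"))
```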

Network Data Transfer

Packet Switching vs. Circuit Switching

There are two fundamental ways to connect computers.

  • Circuit Switching (Old Model):
    • Like the traditional telephone network.
    • Establishes a dedicated circuit for the entire duration of the call.
    • Pros: Guaranteed performance (bandwidth).
    • Cons: Wasted resources if silence occurs; no one else can use that line.
  • Packet Switching (Internet Model):
    • Data is broken into discrete "chunks" called packets.
    • Statistical Multiplexing: Users (like A and B) share network resources. Packets from different users mix together in the line based on demand.
    • Pros: More efficient; allows more users to share the same link.
    • Cons: Can cause congestion and delays if too many people send data at once.

How Packet Switching Works

  • Store and Forward: A router cannot send a packet until it has received the entire packet from the previous hop. It must "store" the bits first, then "forward" them.
  • Resource Contention: If the amount of data arriving exceeds the capacity of the outgoing link, packets must wait in a Queue (buffer). This creates congestion and delay.
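
Resource contention can be illustrated with a tiny discrete-time model: packets arrive each tick, the outgoing link sends a fixed number per tick, and the rest wait in the queue. The tick-based model and the name `simulate_queue` are simplifying assumptions made for this sketch.

```python
from collections import deque


def simulate_queue(arrivals_per_tick, link_capacity_per_tick):
    """Return the queue length at the end of each tick."""
    queue = deque()
    backlog = []
    for tick, arrivals in enumerate(arrivals_per_tick):
        # Store: every arriving packet is fully buffered before it can leave.
        for i in range(arrivals):
            queue.append((tick, i))
        # Forward: the outgoing link can only carry so many packets per tick.
        for _ in range(min(link_capacity_per_tick, len(queue))):
            queue.popleft()
        backlog.append(len(queue))
    return backlog


if __name__ == "__main__":
    # Two bursts of 3 packets on a link that sends 2 per tick.
    print(simulate_queue([3, 3, 0, 0], link_capacity_per_tick=2))
```

While arrivals exceed the link capacity the backlog grows, and it drains only once the bursts stop.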

image.png

Sources of Delay (The 4 Types)

When a packet moves from node to node, it experiences four distinct types of delay:

image.png

  1. Nodal Processing Delay:
    • The time the router takes to check for bit errors and determine the output link.
  2. Queuing Delay:
    • Time spent waiting in the buffer for the line to become free. This depends heavily on network congestion.
  3. Transmission Delay:
    • The time required to push the bits onto the wire.
    • Formula: \(L/R\), where \(L\) is the packet length (bits) and \(R\) is the link bandwidth (bps).
  4. Propagation Delay:
    • The time it takes for the signal to physically travel across the wire.
    • Formula: \(d/s\), where \(d\) is the length of the physical link (m) and \(s\) is the propagation speed of the medium (~\(2 \times 10^8\) m/sec).
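
The two formulas are easy to evaluate for a concrete case. The numbers below (a 1,500-byte packet, a 10 Mbps link, a 1,000 km cable) are illustrative assumptions, as are the function names.

```python
def transmission_delay(packet_bits: float, link_bps: float) -> float:
    """Time to push all bits onto the wire: L / R, in seconds."""
    return packet_bits / link_bps


def propagation_delay(distance_m: float, speed_mps: float = 2e8) -> float:
    """Time for the signal to travel the link: d / s, in seconds."""
    return distance_m / speed_mps


def nodal_delay(processing: float, queuing: float,
                transmission: float, propagation: float) -> float:
    """Total delay at one hop is the sum of the four components."""
    return processing + queuing + transmission + propagation


if __name__ == "__main__":
    t = transmission_delay(1500 * 8, 10_000_000)   # 12,000 bits at 10 Mbps
    p = propagation_delay(1_000_000)               # 1,000 km of cable
    print(f"transmission: {t * 1000:.1f} ms, propagation: {p * 1000:.1f} ms")
```

In this example the transmission delay is 1.2 ms and the propagation delay is 5 ms, so distance dominates; on a slower link or a shorter cable the balance flips.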

Transport Services

Transport Layer Services (Layer 4)

The Transport Layer is responsible for End-to-End communication.

  • Multiplexing/Demultiplexing:

    • It allows multiple applications (e.g., Email, Web, Games) to run on the same computer and use the network simultaneously.
    • Mechanism: It uses Port Numbers (Source Port and Destination Port) to distinguish between these applications.

    image.png

  • Encapsulation:

    • The Transport Layer adds its own header (containing the ports) inside the IP packet.

      [ IP Header | (Transport Header | App Data) ]
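
Demultiplexing amounts to a lookup from destination port to the process that is listening on it. In this sketch the listening "processes" are plain functions, and the port assignments (80 for Apache, 8080 for Tomcat) are common defaults used here only as an assumption.

```python
# Destination port -> handler standing in for a listening process.
LISTENERS = {
    80: lambda data: f"Apache got {data!r}",
    8080: lambda data: f"Tomcat got {data!r}",
}


def demultiplex(dst_port: int, payload: bytes, listeners=LISTENERS):
    """Deliver the payload to whichever process listens on dst_port."""
    handler = listeners.get(dst_port)
    if handler is None:
        return None  # no process is listening on this port
    return handler(payload)


if __name__ == "__main__":
    print(demultiplex(8080, b"GET /app"))
    print(demultiplex(80, b"GET /index.html"))
```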

TCP/IP Layers

image.png

1. The Direction (The Black Arrow)

The arrow points downwards, meaning this is what happens when you send data (e.g., clicking "Send" on an email). The data starts at the top and is handed off to lower layers one by one.

2. The Transformation at Each Layer

As the arrow passes through each circle (interface), the data changes its "form" and "name":

  • Application Layer:
    • Input: The raw "Message" (your email text, image, etc.).
  • Transport Layer Boundary:
    • "Messages (UDP) or Streams (TCP)": The application hands the data to the Transport layer.
      • If using TCP (reliable), the data is treated as a continuous stream of bytes (like water in a pipe).
      • If using UDP (fast/unreliable), the data is treated as distinct messages (chunks).
    • "UDP or TCP packets": The Transport layer chops that stream/message into manageable pieces (segments) and adds port numbers.
  • Internet Layer Boundary:
    • "IP Datagrams": The Network layer takes those TCP/UDP packets, adds IP addresses (Source/Dest IP), and calls the result a Datagram.
  • Network Interface Boundary:
    • "Network-specific frames": Finally, the Link layer adds MAC addresses and error checking to create a Frame that is compatible with the specific hardware (Ethernet, Wi-Fi, ATM) you are using.

Application Service Requirements

Different applications need different things from the Transport Layer. Not all apps need reliability.

| Application | Data Loss Tolerance | Bandwidth Requirement | Time Sensitive? |
| --- | --- | --- | --- |
| File Transfer / Email | No Loss (Must be reliable) | Elastic (Can be slow or fast) | No |
| Web Documents | No Loss | Elastic | No |
| Real-time Audio/Video | Loss-Tolerant (Glitches are okay) | High (Needs minimum speed) | Yes (Must be fast) |
| Interactive Games | Loss-Tolerant | Low to Medium | Yes (Low latency is key) |

  • Key Insight: "Do we always need reliable communication?"
    • No. Real-time apps (Zoom, Games) often prefer speed over perfect accuracy. If a packet is lost in a live video call, there is no point in resending it because the moment has passed.

Application Layer & Transport Services

The Application Layer (Layer 5)

  • Definition: The Application Layer consists of communicating, distributed processes running in the "user space" of network hosts.
  • Architecture:
    • Applications exchange messages to implement functionality (e.g., Email, File Transfer, Web).
    • Note: The network core (routers) does not run application code; logic is only at the endpoints.
  • What is a Protocol?
    • It defines the format and order of messages sent/received, and the actions taken upon transmission or receipt.
    • Analogy: Human protocol ("What's the time?" -> "2:00") vs. Computer protocol (Connection Req -> Connection Reply).

The Programmer's View (API)

  • API (Application Programming Interface) is the general definition for the interface between the Application and Transport layers.
  • The Socket is the specific "Internet API" implementation of that interface; it is the concrete mechanism developers use to send data into and read data out of the network.
  • Abstraction: To a programmer, the internet looks like two choices of service sitting on top of IP:

    1. TCP: Reliable stream.
    2. UDP: Unreliable datagrams.

      image.png

  • Interaction: Processes communicate by sending data into the socket and reading data out of the socket.

Transport Services: TCP vs. UDP

The Transport layer provides "Host-to-Host" data transfer.

TCP (Transmission Control Protocol)

  • Service: Reliable and Ordered delivery.
  • Key Features:
    • Connection-Oriented: Setup is required between client and server before data moves.
    • Flow Control: Ensures the sender does not overwhelm the receiver.
    • Congestion Control: Throttles the sender when the network is overloaded.
  • Does NOT Provide: Timing guarantees or minimum bandwidth guarantees.

UDP (User Datagram Protocol)

  • Service: Unreliable data transfer.
  • Key Features:
    • "Best effort" delivery (no guarantees).
    • No connection setup, no flow control, no congestion control.
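
A minimal sketch of UDP's "just send it" style using Python's standard `socket` module on the loopback interface. There is no `connect`, `listen`, or `accept` step; the helper name `udp_round_trip` and the 2-second timeout are choices made for this example.

```python
import socket


def udp_round_trip(message: bytes) -> bytes:
    """Send one datagram to a local UDP socket and return what it received."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
    receiver.settimeout(2.0)             # UDP gives no delivery guarantee
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # No handshake: the datagram is simply addressed and sent.
        sender.sendto(message, receiver.getsockname())
        data, _addr = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()


if __name__ == "__main__":
    print(udp_round_trip(b"hello over UDP"))
```

On a real network the datagram could be lost, and nothing in UDP would notice; the timeout is the application's own safeguard.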

How TCP Reliability Works

How does TCP make an unreliable network (IP) reliable?

  1. Sequence Numbers: The sender adds a sequence number to every packet.
  2. Buffering: The receiver buffers messages and reorders them before delivering to the application.
  3. ACKs (Acknowledgements): The receiver sends a message back saying "I got it."
  4. Retransmission: If the sender doesn't get an ACK within a certain time, it automatically resends the data.
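
The four mechanisms can be sketched with a deliberately simplified stop-and-wait model. This is not how real TCP is implemented (TCP numbers bytes rather than packets and keeps many packets in flight); the function names and the way losses are scripted through `dropped` are assumptions made for illustration.

```python
def send_reliably(chunks, dropped):
    """Send chunks one at a time, retransmitting until each is acknowledged.

    `dropped` is a set of (sequence number, attempt) pairs that the simulated
    network loses. Returns (list of (seq, chunk) received, transmissions made).
    """
    received = []
    transmissions = 0
    for seq, chunk in enumerate(chunks):       # 1. sequence numbers
        attempt = 0
        while True:
            attempt += 1
            transmissions += 1
            if (seq, attempt) in dropped:
                continue                       # 4. no ACK in time: resend
            received.append((seq, chunk))      # 3. receiver got it and ACKs
            break
    return received, transmissions


def reassemble(received):
    """2. Buffer by sequence number, drop duplicates, deliver in order."""
    buffer = {}
    for seq, chunk in received:
        buffer.setdefault(seq, chunk)
    return b"".join(buffer[seq] for seq in sorted(buffer))


if __name__ == "__main__":
    got, sent = send_reliably([b"a", b"b", b"c"], dropped={(1, 1)})
    print(reassemble(got), "after", sent, "transmissions")
```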

image.png

Application Protocols & Transport Mapping

Different applications choose their transport protocol based on their needs.

| Application | Application Protocol | Underlying Transport | RFC Standard |
| --- | --- | --- | --- |
| Email | SMTP | TCP | RFC 821 |
| Remote Terminal | Telnet | TCP | RFC 854 |
| Web | HTTP | TCP | RFC 2068 |
| File Transfer | FTP | TCP | RFC 959 |
| Streaming Multimedia | Proprietary | TCP or UDP | - |
| Internet Telephony | Proprietary | Typically UDP | - |
TCP Message Encapsulation Example

Visualizing a specific packet traveling over Ethernet:

  • Inner: Application Message.
  • Layer 4: Wrapped in TCP Header (adds Port).
  • Layer 3: Wrapped in IP Header (adds IP Address).
  • Outer: Wrapped in Ethernet Header (adds MAC Address).

TCP Message Encapsulation over an Ethernet

TCP/IP Layers

image.png

Network Failures & Socket Programming

Network Failures

Distributed systems must deal with various types of failures in the network core.

  • Node Failures: Machines can crash or become unreachable, requiring the application to build failure detectors.
  • Data Issues: Data can be lost, corrupted, or reordered during transmission. These specific issues are typically handled by TCP (Transport Layer) so the application doesn't have to fix these errors.
    • If a packet arrives corrupted (the bits are wrong), TCP discards it and waits for the sender to retransmit it (treating it just like a lost packet).
  • Partitions: The network may experience complete or partial partitions (links breaking), which directly impacts the design and logic of network applications.

Berkeley Sockets (The Primitives)

"Berkeley Sockets" is the standard set of functions (primitives) used to program network applications.

| Primitive | Meaning |
| --- | --- |
| Socket | Create a new communication endpoint. |
| Bind | Attach a local address (IP/Port) to a socket. |
| Listen | Announce willingness to accept connections (Server side). |
| Accept | Block the caller until a connection request actually arrives. |
| Connect | Actively attempt to establish a connection (Client side). |
| Send/Receive | Transfer data over the connection. |
| Close | Release the connection. |

image.png

When the code calls the accept function, the program stops running at that exact line. It does not move to the next line of code, and it does not use up CPU processing power. It effectively goes to sleep and waits.

  • If no one is connecting: The program sits there indefinitely (it is "blocked").
  • When a request arrives: The operating system wakes the program up, the accept function finishes, and the code finally moves to the next line to handle the new connection.
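
The blocking behavior can be observed with Python's standard `socket` and `threading` modules. In this sketch a background thread calls `accept` and the main thread checks that it is still waiting before any client connects. The 0.2-second pause and the helper name `demo_blocking_accept` are arbitrary choices for the example.

```python
import socket
import threading
import time


def demo_blocking_accept():
    """Return (was the server blocked before the client connected?, event log)."""
    events = []
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
    server.listen(1)

    def wait_for_client():
        events.append("before accept")
        conn, _addr = server.accept()    # the thread sleeps on this line
        events.append("after accept")
        conn.close()

    waiter = threading.Thread(target=wait_for_client)
    waiter.start()
    time.sleep(0.2)                      # nobody has connected yet
    blocked = waiter.is_alive() and "after accept" not in events
    # A connection request arrives: the OS wakes the waiting thread.
    client = socket.create_connection(server.getsockname(), timeout=5)
    waiter.join(timeout=5)
    client.close()
    server.close()
    return blocked, events


if __name__ == "__main__":
    print(demo_blocking_accept())
```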

TCP Socket Flow

This diagram illustrates the lifecycle of a TCP connection between a Client and a Server.

  • Connection Setup:
    • Server: Creates a serverSocket and waits for a request.
    • Client: Creates a clientSocket and initiates the connection.
    • Handshake: A "TCP connection setup" occurs, during which Sequence Numbers are initialized.
      • The two sides do not exchange every number up front.
      • Each side just announces where its own counting starts (conceptually "Let's start counting at 0"; in practice a randomly chosen initial value).
  • Data Transfer:
    • The client sends a request; the server reads it, processes it, and writes a reply.
    • Sequence Numbers are used throughout this process to identify data, ensure correct order, and detect loss.
  • Teardown:
    • Both the client and server close their respective sockets to end the session.
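
The whole lifecycle can be sketched with Python's standard `socket` module, running the server in a background thread so that the example fits in one file. The names `serverSocket`-style roles map onto `server_socket` and `client_socket` below; the "reply to:" prefix, the loopback address, and the timeouts are assumptions made for illustration. The handshake and sequence numbers are handled by the operating system and never appear in the code.

```python
import socket
import threading


def serve_once(server_socket):
    """Server side: accept one connection, read the request, write a reply."""
    conn, _addr = server_socket.accept()          # blocks until a client connects
    with conn:
        chunks = []
        while True:
            data = conn.recv(1024)
            if not data:                          # client finished sending
                break
            chunks.append(data)
        conn.sendall(b"reply to: " + b"".join(chunks))


def run_exchange(request: bytes) -> bytes:
    """Client side plus setup: connect, send the request, read the reply."""
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_socket.settimeout(5)
    server_socket.bind(("127.0.0.1", 0))          # port 0: OS picks a free port
    server_socket.listen(1)
    port = server_socket.getsockname()[1]
    worker = threading.Thread(target=serve_once, args=(server_socket,))
    worker.start()

    client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client_socket.settimeout(5)
    try:
        client_socket.connect(("127.0.0.1", port))   # TCP connection setup
        client_socket.sendall(request)
        client_socket.shutdown(socket.SHUT_WR)       # signal "request complete"
        reply = b""
        while True:
            data = client_socket.recv(1024)
            if not data:
                break
            reply += data
    finally:
        client_socket.close()                        # teardown, client side
        worker.join()
        server_socket.close()                        # teardown, server side
    return reply


if __name__ == "__main__":
    print(run_exchange(b"GET /"))
```

Because TCP delivers a byte stream rather than messages, both sides read in a loop until the peer signals the end of its data instead of assuming one `recv` returns everything.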

image.png