Processes¶

The Process Abstraction¶

Definition

A process is an instance of a running program.

Complications: what a user thinks of as a “running program” does not always match how it is actually implemented. The mapping between programs and processes is not always one-to-one.
Running a program usually creates at least one new process, but not necessarily exactly one. Some programs may:
- Spawn multiple processes (e.g., a browser may launch helper processes for tabs or extensions).
- Reuse an existing process (e.g., some apps open in a single shared process).
- Create child processes to handle parts of their task.

Definition

A process is an environment in which an application program runs (as opposed to a program which is an executable file in secondary storage e.g. SSD).

Processes are created and managed by the kernel.
In terms of implementation, a process is a struct that has fields for the various resources and performance data the kernel tracks about processes.
- Each process is given a unique id, called a process ID, or simply a pid. The pid is a non-negative number.
- A user can have two different processes running the same program (two different instances of the same program), and they will have different pids.
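As a minimal sketch of the point about pids, the function below creates a second process running the same program and checks that it sees a different pid. It uses fork, getpid and waitpid, which are described later in these notes; the function name child_has_distinct_pid is made up for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if a forked child observed a pid different
 * from the parent's, 0 if the pids matched, and -1 if fork or waitpid failed. */
int child_has_distinct_pid(void) {
    pid_t parent = getpid();
    pid_t rc = fork();
    if (rc < 0)
        return -1;
    if (rc == 0) {
        /* Child: same program, but a separate process with its own pid. */
        _exit(getpid() != parent ? 1 : 0);
    }
    int status;
    if (waitpid(rc, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);   /* the child's verdict */
}
```

The child reports its finding through its exit status because the two processes do not share variables.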

A Process’s View of the Computer¶

Processes are isolated from one another in the sense that it seems like each process has exclusive access to the processor, main memory, secondary storage, etc.
A bug in one program should not affect the stability or outcome of other processes.
Each process uses its own virtual memory, so the same virtual address in different processes (e.g., P1 and P2) maps to different physical locations, ensuring memory isolation between them.
This isolation simplifies the programming model.
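The memory isolation described above can be observed with a small experiment. This is a sketch, assuming a POSIX system; the variable and function names are illustrative only. A child process writes to a variable, and the parent, which has the same variable at the same virtual address, does not see the change.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Lives at the same virtual address in parent and child after fork. */
static int counter = 1;

/* Hypothetical helper: returns the parent's value of `counter` after the
 * child has modified its own copy, or -1 if fork or waitpid failed. */
int parent_value_after_child_write(void) {
    pid_t rc = fork();
    if (rc < 0)
        return -1;
    if (rc == 0) {
        counter = 99;             /* changes only the child's memory */
        _exit(0);
    }
    if (waitpid(rc, NULL, 0) < 0) /* wait until the child has finished */
        return -1;
    return counter;               /* the parent's copy is untouched */
}
```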

Process Management System Calls¶

The two views of a process are

The user view of processes
- Specifically some process management functions, e.g. how to create, kill, and communicate between processes.
The kernel view of processes
- Implementing processes in the kernel

User View¶

Creating and Waiting for Processes¶

int fork(void);
- Creates a duplicate process (child) of the calling process (parent).
- Returns:
  - Child process gets 0
  - Parent process gets child’s PID
  - On failure, the caller gets -1 and no child is created
- Called once but returns twice (once in parent, once in child)
int waitpid(int pid, int *status, int options);
- Used by parent to wait for a specific child (or any child) to finish.
- Parameters:
  - pid: child's PID or -1 to wait for any child
  - status: pointer to store child's exit status or signal
  - options: e.g. 0 (block until child ends), WNOHANG (don’t block)
- Returns:
  - PID of terminated child or -1 on error
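To illustrate the difference between the two options values, here is a hedged sketch of polling with WNOHANG instead of blocking. The function name poll_for_child and the delays are arbitrary choices for this example. With WNOHANG, waitpid returns 0 while the child is still running, so the parent can do other work between checks.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that exits with `code` after a short
 * delay, then polls for it without blocking. Returns the child's exit code,
 * or -1 on failure. */
int poll_for_child(int code) {
    struct timespec child_delay = { 0, 50 * 1000 * 1000 };  /* 50 ms */
    struct timespec poll_delay  = { 0, 10 * 1000 * 1000 };  /* 10 ms */

    pid_t rc = fork();
    if (rc < 0)
        return -1;
    if (rc == 0) {
        nanosleep(&child_delay, NULL);   /* pretend to do some work */
        _exit(code);
    }

    int status;
    for (;;) {
        pid_t done = waitpid(rc, &status, WNOHANG);
        if (done == rc)
            break;                       /* child has terminated */
        if (done < 0)
            return -1;                   /* waitpid failed */
        /* done == 0: child still running; the parent could work here */
        nanosleep(&poll_delay, NULL);
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```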

Deleting Processes¶

void exit(int status);
- Terminates the current process (similar to returning from main).
- Parameters:
  - status: exit code that the parent can retrieve via waitpid
    - 0 indicates success
    - Non-zero indicates error
- Result:
  - Process ends and status becomes available to parent if it calls waitpid.
int kill(int pid, int sig);
- Sends a signal to a specific process.
- Parameters:
  - pid: process ID of the target process
  - sig: signal to send (e.g., SIGTERM, SIGKILL)
- Notes:
  - A signal is a software interrupt that can control process behavior.
  - SIGTERM: graceful termination (can be handled by the process, for example, to clean up)
  - SIGKILL: immediate termination (cannot be ignored or caught)
- Returns:
  - 0 on success
  - -1 on error (e.g., invalid pid or insufficient permissions)
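The following sketch shows kill and waitpid working together, assuming a POSIX system and default signal handling in the child. The function name terminate_child is made up for this example. The parent sends a signal to a child that is doing nothing but waiting, then inspects the status to see which signal ended it.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child, sends it `sig`, and returns the number
 * of the signal that terminated it. Returns 0 if the child exited normally
 * instead, and -1 if fork, kill or waitpid failed. */
int terminate_child(int sig) {
    pid_t rc = fork();
    if (rc < 0)
        return -1;
    if (rc == 0) {
        for (;;)
            pause();              /* child: sleep until a signal arrives */
    }
    if (kill(rc, sig) != 0)       /* kill returns 0 on success, -1 on error */
        return -1;
    int status;
    if (waitpid(rc, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

WIFSIGNALED and WTERMSIG are the signal-side counterparts of the WIFEXITED and WEXITSTATUS macros used elsewhere in these notes.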

execv and getpid system calls¶

int getpid(void);
- Returns the process ID (PID) of the calling process.
- Each process has a unique PID used to identify it.
int execv(const char *path, char *const argv[]);
- Replaces the current process image with a new program.
- The calling process’s current code, data, stack, etc., are destroyed.
- The process gets new memory loaded with the new program.
- After execv, the new program starts executing immediately.
- The PID remains the same (process is reused).
- Arguments for the new program can be passed via argv.

The fork, exit, getpid and waitpid system calls¶

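// sample code that uses fork, getpid, waitpid and exit
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

int child_code(void);            // placeholder: the child's work, returns its exit code
void parent_code(void);          // placeholder: the parent's work

int main(void) {
    pid_t rc = fork();           // returns 0 to child, pid to parent
    if (rc == 0) {               // child executes this code
        pid_t my_pid = getpid();
        int x = child_code();
        exit(x);
    } else {                     // parent executes this code
        int child_exit;
        pid_t child_pid = rc;
        pid_t parent_pid = getpid();
        parent_code();
        pid_t p = waitpid(child_pid, &child_exit, 0);
        if (WIFEXITED(child_exit))
            printf("child exit status was %d\n", WEXITSTATUS(child_exit));
    }
    return 0;
}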

WIFEXITED returns true if the child terminated normally, i.e. it called exit() or returned from main.

WEXITSTATUS returns the exit status code.

The status is not directly the exit code — it’s a bit-encoded value that must be decoded using macros.
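A small sketch of decoding the status follows; the function name exit_code_seen_by_parent is hypothetical. It also shows one consequence of the encoding: only the low 8 bits of the value passed to exit reach the parent, so an exit value of 300 is seen as 44.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: forks a child that exits with `code` and returns the
 * exit code the parent decodes from the status, or -1 on failure. */
int exit_code_seen_by_parent(int code) {
    pid_t rc = fork();
    if (rc < 0)
        return -1;
    if (rc == 0)
        _exit(code);              /* child terminates immediately */

    int status;                   /* encoded value, not the exit code itself */
    if (waitpid(rc, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status))       /* child did not terminate normally */
        return -1;
    return WEXITSTATUS(status);   /* low 8 bits of the child's exit value */
}
```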

execv example¶

// sample code that uses execv to execute the command:
// /testbin/argtest first second
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    int status = 0;         // status of execv function call
    char *args[4];          // argument vector

    // prepare the arguments
    args[0] = (char *) "/testbin/argtest";
    args[1] = (char *) "first";
    args[2] = (char *) "second";
    args[3] = 0;            // end of args

    status = execv("/testbin/argtest", args);
    // execv only returns if it failed
    printf("If you see this output then execv failed\n");
    printf("status = %d errno = %d\n", status, errno);
    exit(1);                // non-zero exit code reports the failure
}

if execv:

fails ⇒ the current program will continue executing
succeeds ⇒ the current program will be replaced with the program argtest

errno is a global variable that holds the value of the last error number
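The failure path can be demonstrated safely by asking execv to run a program that does not exist. This is a sketch; the path /nonexistent/dir/missing is a made-up name chosen so that the call fails, and the function name is hypothetical. Because execv fails, the calling program keeps running and errno explains why.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: tries to exec a program that does not exist and
 * returns the errno value left behind by the failed call. */
int exec_missing_program(void) {
    char *args[] = { (char *) "missing", NULL };

    errno = 0;
    execv("/nonexistent/dir/missing", args);   /* assumed not to exist */

    /* Reached only because execv failed; on success this code is gone. */
    return errno;
}
```

On a POSIX system the returned value is ENOENT ("no such file or directory") when the path does not exist.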

Combining fork and execv¶

// the child executes execv while the parent waits
int main(void) {
    int child_exit;           // exit status of child
    char * args[4];           // args for argtest
    // set up args here as in previous example
    pid_t rc = fork();        // returns 0 to child, pid to parent
    if (rc == 0) {            // child’s code
        int status = execv("/testbin/argtest", args);
        printf("If you see this output, then execv failed\n");
        printf("status = %d errno = %d\n", status, errno);
        exit(1);              // non-zero exit code reports the failure
    } else {                  // parent’s code
        pid_t child_pid = rc;
        parent_code();
        // options = 0: block until the child exits, then collect its exit status
        // WNOHANG: return immediately; the status is collected only if the child has already exited
        pid_t p = waitpid(child_pid, &child_exit, 0);
    }
    return 0;
}
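For a version of this pattern that can be run on an ordinary POSIX machine, the sketch below substitutes /bin/sh for /testbin/argtest. That substitution, the function name run_shell_command, and the exit code 127 for a failed execv are assumptions of this example, not part of the course code.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs `command` with /bin/sh in a child process and
 * returns the child's exit code, or -1 on failure. */
int run_shell_command(const char *command) {
    pid_t rc = fork();
    if (rc < 0)
        return -1;
    if (rc == 0) {
        /* Child: replace this program with the shell. */
        char *args[] = { (char *) "sh", (char *) "-c", (char *) command, NULL };
        execv("/bin/sh", args);
        _exit(127);               /* reached only if execv failed */
    }
    /* Parent: block until the child (now running the shell) exits. */
    int status;
    if (waitpid(rc, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_shell_command("exit 3") returns 3: the pid the parent waits on is the one fork returned, even though the child has since become a different program.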

System Calls¶

Definition

System calls are the application programming interface (API) that programmers use to interact with the kernel.

Applications do not call kernel functions directly. A system call is made with a special instruction (similar to a function call) that transfers control to the kernel, which handles the task.

A function call within the same privilege level (user space or kernel space) is just a regular function call.

If execution crosses from user space to kernel space, it’s a system call.

System Call Interface¶

The application calls a library wrapper function, e.g. fprintf, that copies the arguments into the appropriate registers
and then executes a system call, moving from user space to kernel space.
The kernel executes the system call
and returns to the wrapper function, which is in user space,
which then returns the results to the application program.
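The role of the wrapper can be seen by bypassing it. The sketch below is specific to Linux with glibc, which is an assumption of this example rather than something these notes rely on: the generic syscall() function takes a system call number and performs the transition into the kernel, which is what the getpid() wrapper does on the caller's behalf.

```c
#define _GNU_SOURCE               /* exposes syscall() in glibc headers */
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (Linux/glibc only): asks the kernel for the pid by
 * system call number instead of going through the getpid() wrapper. */
pid_t getpid_via_raw_syscall(void) {
    return (pid_t) syscall(SYS_getpid);
}
```

Both paths end up in the same kernel code, so getpid_via_raw_syscall() and getpid() return the same value.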

Privileged Code

Definition: Code that runs in a privileged mode (also called kernel mode or supervisor mode).
Capabilities:
- Direct access to hardware (CPU, memory, devices).
- Can execute sensitive instructions, like modifying control registers, I/O operations, and managing memory.
- Full access to kernel resources (e.g., process scheduling, file systems).
Used By: The operating system kernel and some low-level system services.

Unprivileged Code

Definition: Code that runs in user mode, with restricted access to resources.
Capabilities:
- Cannot directly access hardware or execute privileged instructions.
- Must use system calls to request services from the kernel.
Used By: User applications and general-purpose software.

Privilege Modes¶

Modern processors support multiple privilege modes to separate user programs from the operating system kernel.
Kernel Mode (Privileged Mode):
- Used by the OS.
- Can execute all CPU instructions and access all memory.
- Can:
  - Enable/disable interrupts
  - Modify hardware configurations
  - Access privileged registers
  - Modify the TLB (virtual memory)
  - Communicate directly with devices (e.g., mouse, keyboard)
User Mode:
- Used by regular application programs.
- Cannot access kernel memory or execute privileged instructions.
- Must use system calls to request OS services.
This separation provides protection:
- User programs can’t crash or corrupt the OS.
- Processes are isolated from each other for safety.

Mode Transitions¶

Kernel Mode can only be entered via well-defined entry points, which are built into the processor.
There are two main triggers for mode switches:

Interrupts¶

Generated by devices to signal that they need attention
Examples: keyboard input, mouse movement, incoming network packets.
The CPU switches to privileged mode to run the handler.
Interrupt Process:
- Devices raise an interrupt by changing voltage on a physical line or sending a bus message.
- The CPU pauses the current program and jumps to a kernel-level interrupt handler.
- It begins executing the interrupt handler in privileged mode.
Most interrupts can be disabled, but not all.
- Non-maskable interrupts (NMI) cannot be disabled—used for critical events like hardware failure or overheating.

Exceptions¶

Conditions discovered by the processor while executing an instruction
Examples:
- Divide by zero
- Page fault (data not in RAM)
- Illegal instructions
Triggered during program execution, often due to:
- Programming errors (e.g., divide by zero)
- Operating system requests (e.g., page fault)
- Hardware failures
The processor stops the instruction that triggered the exception (usually) and jumps to an exception handler in kernel mode.
System calls are implemented using exceptions—software triggers them intentionally to request OS services.
Note

A system call is a kind of exception because:
1. It is internal:
  - It originates from inside the CPU, due to a special instruction (like syscall, int, or svc).
2. It is synchronous:
  - It happens at a specific point in the program — not randomly.
  - It is tied to the execution of a particular instruction.
3. It is caused by the program:
  - The user program deliberately executes a system call to request a service from the OS (e.g., file I/O, process creation).

Interrupts and exceptions cause the processor to transfer control to the interrupt/exception handlers

Key Differences

Feature	Interrupt	Exception
Cause	External to the CPU (hardware)	Internal to the CPU (software or hardware)
Examples	Keyboard input, timer, disk I/O	Divide by zero, page fault, invalid opcode
Timing	Asynchronous (can happen anytime)	Synchronous (happens during instruction)
Initiated by	Devices or hardware controllers	Current instruction or CPU state
Handler Called	Based on interrupt vector (hardware)	Based on exception table (trap/fault handler)
Program’s Role	Program is unaware; OS must respond	Program causes it (e.g., illegal instruction)

x64 Processors¶

%rax: Accumulator register (used for arithmetic and return values)
%rbx: Base pointer for memory access (e.g., array indexing)
%rcx: Counter register for loops
%rdx: Data register for I/O and arithmetic
%rdi: Destination index (used for copying data)
%rsi: Source index (used for copying data)
%rbp: Base/frame pointer (points to bottom of current stack frame)
%rsp: Stack pointer (points to top of stack)
%rip: Instruction pointer (a.k.a. program counter; points to next instruction)
%r8 to %r15: Additional general-purpose registers

Caller and Callee-saved Registers¶

Caller-saved Registers¶

Not preserved across function calls.
The caller must save them (e.g., on the stack) if it needs the values later.
If not saved, the callee may overwrite them.
Used for temporary values that don’t need to survive a function call.

Callee-saved Registers¶

Must be preserved by the callee if it uses them.
The callee saves their original values at the start and restores them before returning.
Used when values need to persist across function calls.

This convention avoids unnecessary saving/restoring:

Caller saves what it needs.
Callee protects what the caller might expect to remain unchanged.

How are arguments passed?¶

Definition

The Application Binary Interface (ABI) defines the contract between an application’s functions and system calls (or more generally between any two machine code modules).

The calling convention specifies:
- How registers are used (which registers hold arguments or return values)
- Who is responsible for saving registers (caller vs. callee)
- Stack alignment rules (e.g., the x86-64 System V ABI requires the stack pointer to be 16-byte aligned at a call)
Why it matters: OS and compilers must follow these conventions to ensure correct low-level function interactions.

x64 Calling Conventions¶

Caller-saved registers¶

Not preserved across function calls.
The caller must save them if needed before calling another function.
Examples:
- r10, r11: scratch registers
- rdi, rsi, rdx, rcx, r8, r9: argument registers
- rax, rdx: return values

Callee-saved registers¶

Preserved across function calls.
The callee must save and restore them if it uses them.
Examples:
- rbx, r12–r15: saved registers

Stack-related registers¶

rsp: Stack pointer
rbp: Frame pointer (used if compiled with -fno-omit-frame-pointer)

Instructions¶

call: Calls a function and pushes the return address to the stack (like MIPS jalr)
ret: Returns from a function (like MIPS jr $31)

Functions in x64¶

Functions are called using the call instruction.
- This pushes the return address onto the stack and jumps to the target function.

foo:
    push %rbp           # Save caller's frame pointer
    mov %rsp, %rbp      # Set the current frame pointer to the top of the stack

    # Save caller-save registers (if needed)
    call bar # Call the subroutine bar

    # Restore caller-save registers (if needed)
    pop %rbp
    ret # Return

Caller should save caller-saved registers if needed before call.
After the call, caller may restore them as needed.

System Calls¶

System calls use the T_SYSCALL exception vector to transition from user mode to kernel mode. It is called a vector, but it is really just an integer that selects a handler.

Steps

The application loads the system call arguments into registers.
It stores the system call number in register %rdi (first argument).
Executes int 60, which triggers a software interrupt.
- int is the assembly instruction to raise an interrupt.
- 60 maps to T_SYSCALL in the interrupt vector table.
The processor looks up the interrupt vector 60 to find the corresponding handler address.
It jumps to that address (the kernel's system call handler).
Once the kernel finishes, it uses the iret instruction to return to user space and resume execution.

System Call Numbering¶

The system call number is passed as the first argument into syscall.

#ifndef __SYS_SYSCALL_H__
#define __SYS_SYSCALL_H__

#define SYSCALL_NULL        0x00
#define SYSCALL_TIME        0x01
#define SYSCALL_GETPID      0x02
#define SYSCALL_EXIT        0x03
#define SYSCALL_SPAWN       0x04
#define SYSCALL_WAIT        0x05

// Memory
#define SYSCALL_MMAP        0x08
#define SYSCALL_MUNMAP      0x09
#define SYSCALL_MPROTECT    0x0A

// Stream
#define SYSCALL_READ        0x10
#define SYSCALL_WRITE       0x11
#define SYSCALL_FLUSH       0x12

// File
#define SYSCALL_OPEN        0x18
#define SYSCALL_CLOSE       0x19
#define SYSCALL_MOVE        0x1A
#define SYSCALL_DELETE      0x1B
#define SYSCALL_SETLENGTH   0x1C
#define SYSCALL_STAT        0x1D
#define SYSCALL_READDIR     0x1E

// IPC
#define SYSCALL_PIPE        0x20

// Threading
#define SYSCALL_THREADCREATE    0x30
#define SYSCALL_GETTID      0x31
#define SYSCALL_THREADEXIT  0x32
#define SYSCALL_THREADSLEEP 0x33
#define SYSCALL_THREADWAIT  0x34

// Network
#define SYSCALL_NICSTAT     0x40
#define SYSCALL_NICSEND     0x41
#define SYSCALL_NICRECV     0x42

// System
#define SYSCALL_SYSCTL      0x80
#define SYSCALL_FSMOUNT     0x81
#define SYSCALL_FSUNMOUNT   0x82
#define SYSCALL_FSINFO      0x83

uint64_t Syscall_Entry(uint64_t syscall, uint64_t a1, uint64_t a2,
               uint64_t a3, uint64_t a4, uint64_t a5);

#define SYSCALL_PACK(_errcode, _val) (((uint64_t)_errcode << 32) | (_val))
#define SYSCALL_ERRCODE(_result) (_result >> 32)
#define SYSCALL_VALUE(_result) (_result & 0xFFFFFFFF)

#endif /* __SYS_SYSCALL_H__ */
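The numbering and the result-packing macros can be exercised in user space with a toy dispatcher. This is only a sketch: demo_syscall_entry, DEMO_PID and DEMO_ENOSYS are invented for the illustration and are not the kernel's Syscall_Entry. The macros here are copies of the ones in the header with extra parentheses and a mask on the value, so the example behaves predictably for any argument.

```c
#include <stdint.h>

#define SYSCALL_GETPID 0x02       /* same number as in the header above */

/* Hardened copies of the header's macros: error code in the high 32 bits,
 * value in the low 32 bits. */
#define SYSCALL_PACK(_errcode, _val) \
    (((uint64_t)(_errcode) << 32) | ((uint64_t)(_val) & 0xFFFFFFFFu))
#define SYSCALL_ERRCODE(_result) ((uint64_t)(_result) >> 32)
#define SYSCALL_VALUE(_result)   ((uint64_t)(_result) & 0xFFFFFFFFu)

#define DEMO_PID    7             /* hypothetical pid returned by the demo */
#define DEMO_ENOSYS 38            /* hypothetical "no such system call" code */

/* Toy dispatcher: selects a handler by system call number, the way a kernel
 * entry point would, and returns a packed (error code, value) result. */
uint64_t demo_syscall_entry(uint64_t syscall, uint64_t a1) {
    (void) a1;                    /* unused by the calls handled here */
    switch (syscall) {
    case SYSCALL_GETPID:
        return SYSCALL_PACK(0, DEMO_PID);     /* success: errcode 0 */
    default:
        return SYSCALL_PACK(DEMO_ENOSYS, 0);  /* unknown call number */
    }
}
```

Packing both fields into one 64-bit result lets a single return register carry the error code and the value back to user space.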

x64 Exception Vectors¶

Interrupts, exceptions and system calls use the same mechanism, sometimes called a trap, i.e. “trap into the kernel.”

In the narrower sense, a trap is a software-generated, synchronous, intentional exception that transfers control from a user-level process to the operating system kernel.

#ifndef __TRAP_H__
#define __TRAP_H__

#define T_DE        0   /* Divide Error Exception */
#define T_DB        1   /* Debug Exception */
#define T_NMI       2   /* NMI Interrupt */
#define T_BP        3   /* Breakpoint Exception */
#define T_OF        4   /* Overflow Exception */
#define T_BR        5   /* BOUND Range Exceeded Exception */
#define T_UD        6   /* Invalid Opcode Exception */
#define T_NM        7   /* Device Not Available Exception */
#define T_DF        8   /* Double Fault Exception */
#define T_TS        10  /* Invalid TSS Exception */
#define T_NP        11  /* Segment Not Present */
#define T_SS        12  /* Stack Fault Exception */
#define T_GP        13  /* General Protection Exception */
#define T_PF        14  /* Page-Fault Exception */
#define T_MF        16  /* x87 FPU Floating-Point Error */
#define T_AC        17  /* Alignment Check Exception */
#define T_MC        18  /* Machine-Check Exception */
#define T_XF        19  /* SIMD Floating-Point Exception */
#define T_VE        20  /* Virtualization Exception */

#define T_CPU_LAST  T_VE

// IRQs
#define T_IRQ_BASE  32
#define T_IRQ_LEN   24
#define T_IRQ_MAX   (T_IRQ_BASE + T_IRQ_LEN - 1)

#define T_IRQ_TIMER (T_IRQ_BASE + 0)
#define T_IRQ_KBD   (T_IRQ_BASE + 1)
#define T_IRQ_COM1  (T_IRQ_BASE + 4)
#define T_IRQ_MOUSE (T_IRQ_BASE + 12)

// LAPIC Special Vectors
#define T_IRQ_SPURIOUS  (T_IRQ_BASE + 24)
#define T_IRQ_ERROR (T_IRQ_BASE + 25)
#define T_IRQ_THERMAL   (T_IRQ_BASE + 26)

#define T_SYSCALL   60  /* System Call */
#define T_CROSSCALL 61  /* Cross Call (IPI) */
#define T_DEBUGIPI  62  /* Kernel Debugger Halt (IPI) */

#define T_UNKNOWN   63  /* Unknown Trap */

#define T_MAX       64

typedef struct TrapFrame
{
    uint64_t    r15;
    uint64_t    r14;
    uint64_t    r13;
    uint64_t    r12;
    uint64_t    r11;
    uint64_t    r10;
    uint64_t    r9;
    uint64_t    r8;
    uint64_t    rbp;
    uint64_t    rdi;
    uint64_t    rsi;
    uint64_t    rdx;
    uint64_t    rcx;
    uint64_t    rbx;
    uint64_t    ds;
    uint64_t    rax;

    uint64_t    vector;
    uint32_t    errcode;
    uint32_t    _unused0;
    uint64_t    rip;
    uint16_t    cs;
    uint16_t    _unused1;
    uint16_t    _unused2;
    uint16_t    _unused3;
    uint64_t    rflags;
    uint64_t    rsp;
    uint16_t    ss;
    uint16_t    _unused4;
    uint16_t    _unused5;
    uint16_t    _unused6;
} TrapFrame;

void Trap_Init();
void Trap_InitAP();
void Trap_Dump(TrapFrame *tf);
void Trap_Pop(TrapFrame *tf);

#endif /* __TRAP_H__ */
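Since every kind of event arrives as a vector number, a handler can tell them apart by range. The sketch below copies a few constants from the header above and classifies a vector; the enum and the function classify_vector are hypothetical names for this illustration, not part of the kernel.

```c
/* Constants copied from the header above. */
#define T_CPU_LAST  20
#define T_IRQ_BASE  32
#define T_IRQ_LEN   24
#define T_IRQ_MAX   (T_IRQ_BASE + T_IRQ_LEN - 1)
#define T_SYSCALL   60

typedef enum {
    VEC_CPU,       /* processor-defined vectors 0..T_CPU_LAST (includes NMI) */
    VEC_IRQ,       /* device interrupts T_IRQ_BASE..T_IRQ_MAX */
    VEC_SYSCALL,   /* the system call vector */
    VEC_OTHER      /* everything else, e.g. LAPIC and IPI vectors */
} VectorKind;

/* Hypothetical helper: classifies a vector number by the ranges above. */
VectorKind classify_vector(int vector) {
    if (vector >= 0 && vector <= T_CPU_LAST)
        return VEC_CPU;
    if (vector >= T_IRQ_BASE && vector <= T_IRQ_MAX)
        return VEC_IRQ;
    if (vector == T_SYSCALL)
        return VEC_SYSCALL;
    return VEC_OTHER;
}
```

With these ranges, the page-fault vector 14 is classified as VEC_CPU, the keyboard vector 33 as VEC_IRQ, and vector 60 as VEC_SYSCALL.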

Stack¶

The kernel does not rely on user-space data structures like the application's stack, for several reasons:
- The user stack could be corrupted or full (e.g., due to infinite recursion).
- Kernel code running on the user stack could overflow it.
- Kernel data left on a user stack would be readable by user code, exposing privileged data.
Instead, the kernel uses its own stack, known as the kernel stack.

Kernel Stack

Maintained entirely by the kernel.
Used for executing kernel code, including:
- System calls
- Exceptions
- Interrupts

Trap Frame¶

trap_common: In Castor OS, both interrupts and exceptions are handled using this single routine.
When a function call is made, a stack frame is created to store:
- Arguments
- Local variables
- Register values
For interrupts (not just system calls), the system must save all register values, including:
- General-purpose registers
- Status registers
This is essential because interrupts can happen at any moment (e.g., incoming network packet), and the application might not save caller-save registers before the interrupt.
These saved registers are stored in a structure called a trap frame.
The trap frame is:
- Stored on the kernel stack
- Used to safely resume execution after handling the interrupt

A trap is a type of control transfer from user mode to kernel mode — it happens when a user program requests a service from the operating system, or when an exception or interrupt occurs.
