Class 1: BPF¶

Date: 24.02.2026

Extras¶

QEMU

Scenario¶

Introduction¶

Welcome to the course on advanced operating systems! During the laboratory, you will learn how a modern operating system really works, and how to modify it to add new features.

Tip

Important links:

The main site: https://students.mimuw.edu.pl/ZSO/
Labs: https://students.mimuw.edu.pl/ZSO/PUBLIC-SO/2025-2026/
Moodle: https://moodle.mimuw.edu.pl/course/view.php?id=2819
Slack: https://app.slack.com (you should get an invitation)

Passing this course requires implementing three relatively large assignments [1] that will gradually introduce you to the internals of the operating system. In addition, there will be small assignments that help you with the large ones in two ways:

they will introduce you to the necessary concepts and tools
they will also give you extra days for the large assignments, which are often crucial for debugging.

The deadline for each small assignment is the Sunday before the next lecture. The schedule of large assignments is available at https://students.mimuw.edu.pl/ZSO/. Solutions suspected of plagiarism or of being AI-generated will be subject to additional verification, including an oral examination.

Tip

Extra days are also given for attending lectures, so try to be present as much as possible!

We will work with Linux (specifically, kernel 6.18.5 [2] that was released earlier this year). You will be expected to understand the kernel code and modify it to implement the required features. We will work with four types of code:

user-space code, which is the code of regular applications and libraries
kernel code, which is the implementation of the Linux kernel itself
kernel modules, which are pieces of code that can be loaded into the kernel at runtime
BPF code, which is a special type of code that runs in kernel space but doesn't require recompiling the kernel or loading a module

Students are expected to be familiar with user-space code already, so this course focuses mostly on the ways of interacting with the kernel. One of the most common ways of doing that is through system calls, which let a process request that the kernel perform some operation on its behalf.

Hands-on

strace is a tool that traces the system calls made by a process. Run strace ls / to see the system calls made by the ls command when listing the root directory.

For the majority of the labs, we will be working with the Linux kernel code and modules. The kernel is implemented mostly in C, but its code is very different from the user-space code you may be used to. There is no libc in the kernel, because libc itself depends on the kernel. The kernel therefore provides its own implementations of many functions: for instance, there is kmalloc instead of malloc, and printk instead of printf.

BPF code is the main topic of this laboratory and of the first large assignment, so we will discuss it in more detail soon. Before that, let's set up the environment. To provide a coherent working environment for everyone, we will use a virtual machine based on QEMU.

QEMU¶

We are going to use QEMU for virtualization in this course. Using QEMU makes debugging changes to the kernel much easier.

Important

Please verify this week that you can use QEMU with the provided image, as some of you may have trouble on your own machine (especially on non-x86 CPUs or with a disabled hypervisor). The safe bet is to use the computers in the labs, as you cannot run long-running processes on the students machine.

If you're stuck, feel free to ask us questions on Slack!

You'll find details on starting QEMU here: QEMU.

What is BPF?¶

BPF (Berkeley Packet Filter) is a technology that allows user-space processes to supply filtering programs. In short, BPF enables writing small programs (which are not kernel modules) that execute in kernel mode. A simple example of a BPF program (available in man 2 bpf) is a filter that counts TCP and UDP packets received by the operating system.

BPF has many practical applications, including security, tracing and profiling processes, managing network interfaces, and system monitoring [3]. The technology is gaining popularity: in 2019, Netflix used 15 BPF programs by default, while Facebook used 40 in production [5]. It has also been developing rapidly in recent years: the linux/kernel/bpf directory was modified by almost 500 commits in 2025.

One unquestionable advantage of BPF is its high efficiency, which allows relatively simple programs to be executed for every packet at 10 Gb/s speeds without noticeable delays [7]. However, BPF programs are not necessarily faster than their in-kernel equivalents [8]; their main feature is enabling execution of user-supplied code in kernel mode. Since this can obviously pose security risks, BPF programs are verified first and then run in a sandbox environment, as we will discuss shortly.

Regarding terminology, the BPF abbreviation originates from the 1993 publication "The BSD Packet Filter" [9]. Linux 3.18 introduced extended BPF (eBPF), which added, among other things, support for 64-bit registers; the older version then started being referred to as cBPF (classic BPF). Nowadays, the technology is generally referred to simply as BPF, though the term eBPF can still be encountered [4].

Tip

Useful links:

Introduction to eBPF
Kernel BPF documentation
eBPF on Linux -- a bit nicer than the above
BPF and XDP Reference Guide with technical details

Types of BPF Programs¶

BPF programs can be of various types, specified in enum bpf_prog_type in include/uapi/linux/bpf.h.

Kernel version 6.18 contains over 30 types, of which some of the more important are:

BPF_PROG_TYPE_SOCKET_FILTER for filtering (dropping or truncating) packets delivered to a socket
BPF_PROG_TYPE_KPROBE for function instrumentation
BPF_PROG_TYPE_XDP to decide the fate of packets early in their processing (before performing costly operations), which is useful for DDoS protection
BPF_PROG_TYPE_CGROUP_* for additional cgroup permission management

Creating BPF Programs¶

BPF programs resemble assembly but have their own register set and instruction set. They provide 11 registers to the programmer: R0-R9 (read/write) and R10 (read-only stack frame pointer, similar to RBP in x86_64). Registers are modified by numerous instructions [10], which allow for:

arithmetic operations (e.g. BPF_ADD, BPF_MUL),
jumps and function calls (e.g. BPF_JEQ, BPF_JLE, BPF_CALL),
loading and storing values (e.g. BPF_LD, BPF_ST).

One way to write a BPF program is to fill in struct bpf_insn entries by hand (as in samples/bpf/bpf_insn.h). However, this has obvious disadvantages (just like writing programs in assembly), so there are tools, including bcc [11] and libbpf, that allow writing BPF programs in higher-level languages such as C, C++, Python, or Go.

Since BPF programs are executed in a sandbox, they cannot (typically) call arbitrary kernel functions and have limited options for interaction with the outside world. Instead, they use helper functions, whose capabilities include:

simple printing (bpf_trace_printk),
retrieving context information (e.g. bpf_get_current_uid_gid),
communicating with user space through various types of associative arrays (bpf_map_*),
performing operations specific to the program type (e.g., dropping a packet),
invoking other BPF programs (bpf_tail_call).

A prepared BPF program is verified by the kernel and then compiled using JIT (just-in-time compilation) into machine code. The code responsible for compilation for the x86 architecture is located in the file arch/x86/net/bpf_jit_comp.c. After compilation, the BPF program can be executed. The BPF program is typically accompanied by a user-space program that mediates communication with it.

For the documentation of helper functions, see the eBPF Docs and manpage.

Kernel Tracing¶

Linux has multiple facilities for tracing and observing what is happening inside it. The most important ones include tracepoints, ftrace, and Kprobes. In short, they allow hooking (placing probes) at various places. Of special interest are dynamic traces, which allow hooking at runtime with virtually no overhead while no probe is attached.

The idea of tracepoints is straightforward: we explicitly place code that checks whether a probe is connected and, if so, calls it with some arguments. Function tracing with ftrace is a bit trickier, as we need help from the compiler to put a stub call at each function entry. With dynamic ftrace on x86, you can see a call to __fentry__ at the start of almost every function (check it yourself with objdump --disassemble=vfs_write vmlinux | less). The function entry hook is also used to place a function exit hook: we just need to replace the return address on the stack with a pointer to a specially crafted trampoline. As an extra optimization, the kernel modifies its own code and replaces these calls with NOPs until they are needed.

Kprobes are more powerful, as they allow hooking at individual instructions. In principle, a kprobe works by replacing the instruction in question with a breakpoint instruction that redirects the execution flow; the kernel then runs the registered probes together with the original instruction and returns to the main flow.

Hands-on

First, check whether the necessary kernel features are enabled, using bpftool:

bpftool feature probe

bpftool is developed alongside libbpf in the main kernel tree.

bpftrace¶

bpftrace is a tool enabling quick hacking and prototyping around BPF probe facilities. A short, yet very effective introduction on how to use bpftrace can be found in the bpftrace GitHub repository [12].

Hands-on

You may check a list of all available probes with:

bpftrace -l

Go ahead and run your first BPF program with something like:

bpftrace -e 'kprobe:do_nanosleep { printf("PID %d sleeping...\n", pid); }'

Then run sleep (e.g. sleep 1) in another terminal. Keep the trace running, as we will examine it with bpftool. The command

bpftool prog list

lists the currently loaded BPF programs. You can see the BPF instructions (after initial translation by the kernel) with:

bpftool prog dump xlated id <id>
# or, in this specific case, just:
bpftool prog dump xlated name do_nanosleep

You can see what maps are being used with bpftool map:

bpftool map

In this case, bpftrace uses perf_event_array to implement its printf. You may read these events with:

bpftool map event_pipe id <id>

Small Assignment #1¶

For the first small assignment, you need to write a BPF program using bpftool that counts how many times each signal number was sent to processes in the system, and plot a histogram of the results.


References¶