Class 12: Kernel Debugging¶
Date: 20.05.2025
Resources¶
kernel config with various debugging options enabled: config
a qemu image with a compiled kernel: https://students.mimuw.edu.pl/ZSO/PUBLIC-SO/debugging/zso2025_debug.qcow2.gz (available on students at /home/students/inf/PUBLIC/ZSO/debugging/zso2025_debug.qcow2)
the build directory with the compiled kernel (for kgdb): https://students.mimuw.edu.pl/ZSO/PUBLIC-SO/debugging/linux-6.12.6.tar.gz (available on students at /home/students/inf/PUBLIC/ZSO/debugging/linux-6.12.6)
a working crash utility: https://students.mimuw.edu.pl/ZSO/PUBLIC-SO/debugging/crash.gz (available on students at /home/students/inf/PUBLIC/ZSO/debugging/crash, also included in the qemu image)
The qemu image is incremental, and its backing file is the image from QEMU. If you want to use the image on your own machine, you can change the backing file path by running qemu-img rebase -u -b <your_backing_file_path> zso2025_debug.qcow2, where <your_backing_file_path> is a path to your identical copy of the image from QEMU.
Additional materials¶
Debugging by Printing¶
Using printk() for debugging kernel space code is analogous to using printf() for debugging user space code. It is very easy to use and needs no special setup (it requires CONFIG_PRINTK, which is usually enabled), which makes it especially good for quick checks.
printk() writes to a circular buffer: when the buffer fills up, new messages overwrite the oldest ones. The size of the buffer is specified in CONFIG_LOG_BUF_SHIFT (max 32 MiB) and can be overridden using the log_buf_len kernel boot parameter (max 2 GiB). The size is always a power of 2. The buffer can be read using the dmesg command.
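To make the size arithmetic concrete, here is a minimal user-space sketch (not kernel code). The helper names demo_log_buf_size() and demo_roundup_pow2() are made up for illustration. The sketch assumes that CONFIG_LOG_BUF_SHIFT is the base-2 logarithm of the buffer size in bytes, and that a requested log_buf_len which is not a power of 2 gets rounded up to the next one.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of a log buffer configured with the given shift value. */
static uint64_t demo_log_buf_size(unsigned int shift)
{
    assert(shift < 64);
    return (uint64_t)1 << shift;
}

/*
 * Smallest power of two that is >= size.
 * Assumption: a log_buf_len value that is not a power of two is rounded up
 * like this before it is used.
 */
static uint64_t demo_roundup_pow2(uint64_t size)
{
    uint64_t p = 1;

    assert(size != 0 && size <= ((uint64_t)1 << 63));
    while (p < size)
        p <<= 1;
    return p;
}
```

With these helpers, demo_log_buf_size(25) gives the 32 MiB maximum of CONFIG_LOG_BUF_SHIFT, while a 64 MiB buffer would need a shift of 26, so it has to be requested with the log_buf_len boot parameter instead.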
Instead of using printk() directly, one can use the convenience macros defined for each log level, such as pr_info(), pr_err(), etc. These macros differ from printk() in that they first apply the pr_fmt() macro to the format string. By default, pr_fmt() does nothing, but one can define it to, for example, add a custom header to each message. Note that pr_fmt() must be defined before printk.h gets included. pr_debug() and pr_devel() are compiled in only if DEBUG is defined.
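The way pr_fmt() adds a header relies on plain string-literal concatenation, which can be tried out in user space. The sketch below is an analogue, not kernel code: demo_pr_info() and demo_has_header() are hypothetical stand-ins that format into a caller-supplied buffer instead of the kernel log, and the "hello: " header is just an example.

```c
/*
 * In a real module this definition has to appear before any kernel header
 * is included, so that printk.h does not install its default pr_fmt().
 */
#define pr_fmt(fmt) "hello: " fmt

#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for pr_info(): the format string is
 * passed through pr_fmt() first, then formatted into buf.
 * ##__VA_ARGS__ is a GNU extension (supported by gcc and clang).
 */
#define demo_pr_info(buf, size, fmt, ...) \
    snprintf((buf), (size), pr_fmt(fmt), ##__VA_ARGS__)

/* Returns 1 if msg starts with the header added by pr_fmt(), 0 otherwise. */
static int demo_has_header(const char *msg)
{
    assert(msg != NULL);
    return strncmp(msg, "hello: ", strlen("hello: ")) == 0;
}
```

Because the header is glued onto the format string at compile time, every message printed through the macro carries it without any runtime cost beyond the longer string.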
There is also printk_once(), which prints a message only once, and printk_ratelimited(), which prints messages at a limited rate. Both of these macros also have corresponding pr_*() convenience macros. Note that the files /proc/sys/kernel/printk_ratelimit and /proc/sys/kernel/printk_ratelimit_burst control only the printk_ratelimit() function (not printk_ratelimited()), all of whose call sites share the limiting state. All other rate-limited print functions have their parameters hardcoded in the kernel.
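The idea of limiting by an interval and a burst can be illustrated with a small user-space sketch. It is a simplified analogue and not the kernel's implementation: struct demo_ratelimit and demo_ratelimit_allow() are hypothetical names, and the current time is passed in explicitly as a tick count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_ratelimit {
    unsigned long interval;     /* window length, in ticks */
    unsigned int burst;         /* messages allowed per window */
    unsigned long window_start; /* tick at which the current window began */
    unsigned int printed;       /* messages let through in this window */
    unsigned int missed;        /* messages suppressed in this window */
    bool started;               /* false until the first call */
};

/* Returns true if a message may be printed at tick `now`. */
static bool demo_ratelimit_allow(struct demo_ratelimit *rs, unsigned long now)
{
    assert(rs != NULL);

    /* Open a new window on the first call or once the old one has expired. */
    if (!rs->started || now - rs->window_start >= rs->interval) {
        rs->window_start = now;
        rs->printed = 0;
        rs->missed = 0;
        rs->started = true;
    }

    if (rs->printed < rs->burst) {
        rs->printed++;
        return true;
    }

    rs->missed++;
    return false;
}
```

In this model, one struct demo_ratelimit shared by all callers corresponds to the shared state of printk_ratelimit(), while a separate instance per call site corresponds to independently limited messages.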
Lastly, all the aforementioned macros have dev_* versions (e.g. dev_printk(), dev_info()) that should be used in device drivers and contain additional information about the device. Other useful macros are print_hex_dump(), print_hex_dump_debug() and print_hex_dump_bytes().
The kernel's log level can be controlled either via /proc/sys/kernel/printk or using the boot parameters loglevel or ignore_loglevel (see https://www.kernel.org/doc/html/latest/core-api/printk-basics.html).
References:
https://www.kernel.org/doc/html/latest/process/debugging/driver_development_debugging_guide.html#id2
Dynamic Debug¶
If CONFIG_DYNAMIC_DEBUG is set, the functions pr_debug(), dev_dbg(), print_hex_dump_debug() and print_hex_dump_bytes() use dynamic debugging: instead of being enabled at compile time with DEBUG, they can be enabled at runtime. For instructions on how to control dynamic debugging, see https://www.kernel.org/doc/html/latest/admin-guide/dynamic-debug-howto.html (the Examples section is especially helpful; ddcmd is an alias defined earlier in that document).
Hands-on
set log_buf_len to 64 MiB
add a custom header to messages printed in the hello device from Class 9: Character Devices
add a rate-limited warning message to hello_open()
add an info message to hello_release() that gets printed only once
in hello_read(), print a hexdump of the read bytes
add a dynamic debug message to hello_ioctl()
verify that your changes work correctly
Kernel debuggers¶
The kernel has two debugger frontends: kdb and kgdb. kgdb is much more powerful - it allows you to use gdb with additional scripts for inspecting the kernel state. kdb currently allows for setting breakpoints and single-stepping, and has some of the kernel inspection capabilities of kgdb. It no longer has the option to display code disassembly. One advantage of kdb, though, is that only kdb can be run on the machine being debugged (i.e. without a second machine).
If you want to use kgdb, the following kernel options are necessary or recommended: CONFIG_KGDB=y, CONFIG_KGDB_SERIAL_CONSOLE=y, CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y, CONFIG_GDB_SCRIPTS=y, CONFIG_DEBUG_INFO_REDUCED=n.
For kdb, the options are: CONFIG_KGDB=y, CONFIG_KGDB_KDB=y, CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y, CONFIG_KDB_KEYBOARD=y, CONFIG_KGDB_SERIAL_CONSOLE=y, CONFIG_DEBUG_INFO_REDUCED=n.
Hands-on
follow the guide to debug the kernel with kgdb: https://www.kernel.org/doc/html/latest/process/debugging/gdb-kernel-debugging.html
any necessary kernel config options are already set in the provided image, but you need to set nokaslr
run kgdb from the build directory provided in Resources
try the commands from the guide
to observe loading a module, you can use dummy, which is available in the provided kernel (modprobe dummy)
note that the interface of lx_per_cpu() has changed slightly: unlike in the guide, the name of the per-cpu variable should be provided without quotes (and tab-completion works)
play with other kgdb commands (run apropos lx for a list)
use gdb functions (cheatsheets such as https://github.com/reveng007/GDB-Cheat-Sheet are useful)
set breakpoints and try to trigger them, e.g. set a breakpoint on do_open() and then read a file
examine the stack trace at a breakpoint, in particular the arguments and local variables of the different functions in the stack trace
examine the registers
examine the surrounding code in C and assembly
single-step through the function at source code and assembly level, enter some of the function calls. Try different layouts (asm, src, split)
run the function until return
set a watchpoint on a variable and trigger it
set a conditional breakpoint and trigger it
delete and disable some breakpoints
print an expression
print a type definition
debug the kernel with kdb
to enter kdb, you need to configure its I/O and then trigger it with SysRq. You can do this by running echo kbd > /sys/module/kgdboc/parameters/kgdboc; echo g > /proc/sysrq-trigger
see what you can do with kdb, in particular which of the previous steps you can repeat (run help for a command list)
Crash dumps¶
A bug may cause the kernel to crash or hang. In such a case, generating a crash dump that contains information about the kernel's state at the time of the crash can be helpful. Crash dumps can be generated using kdump, which is a mechanism that utilizes kexec to boot a second kernel that captures the crash dump in case of a crash.
There are several ways to force a kernel crash on purpose. The simplest one is to issue the magic SysRq command c. You cannot issue this command to the VM from the keyboard in the usual way, since the key combination would be interpreted by the host; one way that does work inside the VM is writing c to /proc/sysrq-trigger.
Another way to simulate errors and crashes is by using the Linux Kernel Dump Test Module (LKDTM) (CONFIG_LKDTM). It can be controlled from DebugFS (see https://www.kernel.org/doc/html/latest/fault-injection/provoke-crashes.html).
Hands-on
enable crash dumps by installing kdump-tools via apt
select No for kexec-tools handling reboots and Yes for enabling kdump-tools
since the kernel with many debugging options enabled requires more memory, after installing you need to increase the crashkernel size by changing crashkernel=384M-:128M to crashkernel=384M-:256M in /etc/default/grub.d/kdump-tools.cfg and running sudo update-grub
then reboot
force a crash using Magic SysRq
force some errors and crashes using the LKDTM. Experiment with different ones, such as WARNING, LOOP, PANIC, BUG, etc. (a full list can be found by reading the file /sys/kernel/debug/provoke-crash/DIRECT). Try different crash points, such as DIRECT (trigger immediately) and INT_HW_IRQ_EN (trigger on handle_irq_event()).
examine both crash dumps using the crash command
the crash installed from the Debian repository seems to have some issue with gdb crashing, so instead use the provided /home/zso/crash
pass the vmlinux from the kernel compilation directory to crash
examine the dump using the different available commands (see man crash or help in the crash prompt). In particular check the backtrace, processes, machine information, the kernel log, registers and the failing code. Remember that you can run gdb commands inside crash (if a gdb command name conflicts with a crash command, run it as gdb <command>)
Stack traces¶
If the kernel detects a bug and does not crash, it prints a stack trace. The stack trace contains information such as the function call trace, register values and loaded modules. Some scripts in the kernel source tree help with reading stack traces:
scripts/decodecode - disassembles the code bytes printed by kernel oopses
scripts/decode_stacktrace.sh - converts byte offsets in the function call trace to line numbers
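The byte offsets mentioned above appear in call trace entries of the form symbol+0xOFFSET/0xSIZE, optionally followed by a module name in brackets. As a rough illustration of what has to be extracted before an offset can be mapped to a source line, here is a user-space sketch that parses one such entry. struct demo_trace_entry and demo_parse_trace_entry() are hypothetical names, and the 63-character symbol limit is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct demo_trace_entry {
    char symbol[64];      /* function name */
    unsigned long offset; /* byte offset into the function */
    unsigned long size;   /* total size of the function in bytes */
};

/*
 * Parses an entry such as "hello_open+0x1a/0x40 [hello]".
 * Returns 1 on success and 0 if the text does not match the expected form.
 */
static int demo_parse_trace_entry(const char *text,
                                  struct demo_trace_entry *out)
{
    assert(text != NULL && out != NULL);
    memset(out, 0, sizeof(*out));

    /* %lx accepts an optional 0x prefix, like strtoul() with base 16. */
    return sscanf(text, " %63[^+ ]+%lx/%lx",
                  out->symbol, &out->offset, &out->size) == 3;
}
```

The real script does considerably more (it needs the vmlinux with debug information to resolve addresses), so this only shows the format of the input it works on.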
Hands-on
use decode_stacktrace.sh (which also uses decodecode) to examine the stacktraces from the crashes you triggered (take the stacktraces from the dmesg dumps)
Runtime error checkers¶
The kernel has several mechanisms for error checking at runtime, in particular:
KASAN (Kernel Address Sanitizer) - helps find use-after-free and out-of-bounds bugs (CONFIG_KASAN). Alternatively, you can use KFENCE (Kernel Electric-Fence), which has lower precision, but also a lower overhead that makes it suitable for production code.
UBSAN (Undefined Behavior Sanitizer) - detects undefined behavior, such as a bit shift by a negative value (CONFIG_UBSAN)
lockdep (Lock Dependency Validator) - detects potential deadlocks and other locking-related issues (CONFIG_PROVE_LOCKING)
KCSAN (Kernel Concurrency Sanitizer) - detects data races (CONFIG_KCSAN)
Kmemleak (Kernel Memory Leak Detector) - detects memory leaks (by default scans the memory every 10 minutes) (CONFIG_DEBUG_KMEMLEAK)
KMSAN (Kernel Memory Sanitizer) - detects uses of uninitialized values (CONFIG_KMSAN)
Hands-on
introduce bugs into the hello module that will be detected by the above checkers and verify that they are detected (note that in the provided kernel KCSAN and KMSAN are disabled, since they are incompatible with KASAN)
DebugFS¶
DebugFS (CONFIG_DEBUG_FS) makes it easy to expose kernel variables to user space for read or write access via files under /sys/kernel/debug. Any struct file_operations can be provided for these files, but there are also convenient helpers for creating files that access integer variables or (read-only) binary blobs and blocks of registers. See https://www.kernel.org/doc/html/latest/process/debugging/driver_development_debugging_guide.html#id9 and https://www.kernel.org/doc/html/latest/filesystems/debugfs.html for details.
Note that if you need to transfer large quantities of data from the kernel to user space, DebugFS can be used in conjunction with the relay interface to create a circular buffer that can be written to by the kernel and read in user space by reading a DebugFS file (see https://docs.kernel.org/filesystems/relay.html).
Hands-on
expose the hello_repeats variable from the hello module for read-write access via DebugFS
verify that the file works correctly
Fault injection¶
Some kernel functions support fault injection: they can be forced to return an error regardless of whether there was an actual error. The fault injection mechanism can be controlled via DebugFS and provides several parameters that specify when and how the injection should happen, for example depending on the stacktrace (see https://www.kernel.org/doc/html/latest/fault-injection/fault-injection.html).
The related kernel config options are: CONFIG_FAULT_INJECTION, CONFIG_FAULT_INJECTION_DEBUG_FS; configs for each failure type, e.g. CONFIG_FAILSLAB for slab allocation failures, CONFIG_FUNCTION_ERROR_INJECTION for injecting specific error return values.
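To give a feeling for how such parameters decide whether a given call fails, here is a simplified user-space sketch. It models only two of the parameters described in the linked fault injection documentation, interval and times, and assumes a probability of 100%; struct demo_fault_attr and demo_should_fail() are hypothetical names, not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_fault_attr {
    unsigned int interval; /* consider only every interval-th call (0 or 1: every call) */
    int times;             /* failures left to inject; -1 means unlimited */
    unsigned long count;   /* calls counted so far for the interval check */
};

/* Returns true if the current call should be forced to fail. */
static bool demo_should_fail(struct demo_fault_attr *attr)
{
    assert(attr != NULL);

    /* No failures left to inject. */
    if (attr->times == 0)
        return false;

    /* Skip all calls except every interval-th one. */
    if (attr->interval > 1) {
        attr->count++;
        if (attr->count % attr->interval)
            return false;
    }

    /* Use up one of the remaining failures unless the budget is unlimited. */
    if (attr->times > 0)
        attr->times--;
    return true;
}
```

Setting times to 1 in this model corresponds to injecting a single failure, after which all later calls proceed normally.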
Hands-on
inject a single user memory access failure with verbosity set to 2
examine the kernel log