.. _lab-debugging-en:

==========================
Class 12: Kernel Debugging
==========================

Date: 20.05.2025

Resources
=========

- kernel config with various debugging options enabled: :download:`config`
- a qemu image with a compiled kernel: https://students.mimuw.edu.pl/ZSO/PUBLIC-SO/debugging/zso2025_debug.qcow2.gz
  (available on students at ``/home/students/inf/PUBLIC/ZSO/debugging/zso2025_debug.qcow2``)
- the build directory with the compiled kernel (for kgdb): https://students.mimuw.edu.pl/ZSO/PUBLIC-SO/debugging/linux-6.12.6.tar.gz
  (available on students at ``/home/students/inf/PUBLIC/ZSO/debugging/linux-6.12.6``)
- a working ``crash`` utility: https://students.mimuw.edu.pl/ZSO/PUBLIC-SO/debugging/crash.gz
  (available on students at ``/home/students/inf/PUBLIC/ZSO/debugging/crash``, also included in the qemu image)

The qemu image is incremental, and its backing file is the image from :ref:`qemu-en`. If you want to use the image on your own machine, you can change the backing file path by running ``qemu-img rebase -u -b <path> zso2025_debug.qcow2``, where ``<path>`` is a path to your identical copy of the image from :ref:`qemu-en`.

Additional materials
====================

- https://www.kernel.org/doc/html/latest/process/debugging/index.html
- https://www.kernel.org/doc/html/latest/process/debugging/userspace_debugging_guide.html
- https://www.kernel.org/doc/html/latest/process/debugging/driver_development_debugging_guide.html
- https://www.oreilly.com/library/view/linux-device-drivers/0596005903/ch04.html
- https://sergioprado.blog/debugging-the-linux-kernel-with-gdb/

Debugging by Printing
=====================

Using ``printk()`` for debugging kernel space code is analogous to using ``printf()`` for debugging user space code. It is very easy to use without special setup (requires ``CONFIG_PRINTK``, which is usually enabled), making it especially good for quick checks.

``printk()`` writes to a circular buffer - when the buffer fills up, new messages will overwrite the oldest ones. The size of the buffer is specified in ``CONFIG_LOG_BUF_SHIFT`` (max 32 MiB) and can be overridden using the ``log_buf_len`` kernel boot parameter (max 2 GiB). The size is always a power of 2. The buffer can be read using the ``dmesg`` command.

Instead of using ``printk()`` directly, one can use several convenience macros for each log level, such as ``pr_info()``, ``pr_err()``, etc. These macros differ from ``printk()`` in that they first apply the ``pr_fmt()`` macro to the format string. By default, ``pr_fmt()`` does nothing, but one can define it to, for example, add a custom header to each message. Note that ``pr_fmt()`` must be defined before ``printk.h`` gets included. Additionally, ``pr_debug()`` and ``pr_devel()`` are conditionally compiled: only if ``DEBUG`` is defined.

There is also ``printk_once()``, which prints a message only once, and ``printk_ratelimited()``, which prints messages at a limited rate. Both of these macros also have corresponding ``pr_*()`` convenience macros. Note that the files ``/proc/sys/kernel/printk_ratelimit`` and ``/proc/sys/kernel/printk_ratelimit_burst`` control only the ``printk_ratelimit()`` function (not ``printk_ratelimited()``); all call sites of ``printk_ratelimit()`` share a single rate-limiting state. All other rate-limited print functions have their parameters hardcoded in the kernel.

Lastly, all the aforementioned macros have ``dev_*`` versions (e.g. ``dev_printk()``, ``dev_info()``) that should be used in device drivers and contain additional information about the device.

Other useful macros are ``print_hex_dump()``, ``print_hex_dump_debug()`` and ``print_hex_dump_bytes()``.

The kernel's log level can be controlled either via ``/proc/sys/kernel/printk`` or using the boot parameters ``loglevel`` or ``ignore_loglevel`` (see https://www.kernel.org/doc/html/latest/core-api/printk-basics.html).
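
The way the ``pr_*()`` macros apply ``pr_fmt()`` can be sketched in user space. This is only a rough model, not kernel code: ``mock_pr_info()`` and ``format_open_message()`` are hypothetical names, the ``"hello: "`` prefix is just an example header, and the message is formatted into a buffer with ``snprintf()`` instead of being written to the kernel log. It also uses the GNU ``##__VA_ARGS__`` extension, as the kernel macros do.

```c
#include <stdio.h>

/*
 * Defined first, just as a kernel source file has to define pr_fmt()
 * before printk.h is included. The prefix is an arbitrary example.
 */
#define pr_fmt(fmt) "hello: " fmt

/* Simplified model of the header: the default applies only if nothing was defined. */
#ifndef pr_fmt
#define pr_fmt(fmt) fmt
#endif

/*
 * Hypothetical user-space stand-in for pr_info(): the format string is
 * passed through pr_fmt(), so the prefix is glued on at compile time
 * by string literal concatenation.
 */
#define mock_pr_info(buf, size, fmt, ...) \
	snprintf((buf), (size), pr_fmt(fmt), ##__VA_ARGS__)

/* Formats a message the way an open handler might log it. */
int format_open_message(char *buf, size_t size, int minor)
{
	return mock_pr_info(buf, size, "device %d opened\n", minor);
}
```

Calling ``format_open_message(buf, sizeof(buf), 3)`` fills ``buf`` with ``hello: device 3 opened`` followed by a newline. If the ``#define pr_fmt`` at the top were moved below the ``#ifndef`` block, the compiler would report a macro redefinition, which is why the order matters.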

References:

- https://www.kernel.org/doc/html/latest/process/debugging/driver_development_debugging_guide.html#id2
- https://www.oreilly.com/library/view/linux-device-drivers/0596005903/ch04.html#linuxdrive3-CHP-4-SECT-2

Dynamic Debug
-------------

If ``CONFIG_DYNAMIC_DEBUG`` is set, the functions ``pr_debug()``, ``dev_dbg()``, ``print_hex_dump_debug()`` and ``print_hex_dump_bytes()`` use dynamic debugging: instead of using ``DEBUG`` to enable them at compile time, they can be enabled dynamically. For instructions on how to control dynamic debugging, see https://www.kernel.org/doc/html/latest/admin-guide/dynamic-debug-howto.html (the Examples section is especially helpful; ``ddcmd`` is an alias defined earlier in that document).

.. admonition:: Hands-on

   - set ``log_buf_len`` to 64 MiB
   - add a custom header to messages printed in the hello device from :ref:`lab-chardev-en`
   - add a rate-limited warning message to ``hello_open()``
   - add an info message to ``hello_release()`` that gets printed only once
   - in ``hello_read()``, print a hexdump of the read bytes
   - add a dynamic debug message to ``hello_ioctl()``
   - verify that your changes work correctly

.. notes::

   - ``log_buf_len`` can be set in ``/etc/default/grub`` (then do ``sudo update-grub``) or during boot by pressing ``e`` in grub (to get grub to appear on boot, you need to set the timeout in ``/etc/default/grub.d/15_timeout.cfg`` - it overrides the timeout in ``/etc/default/grub``)

Kernel debuggers
================

The kernel has two debugger frontends: kdb and kgdb. kgdb is much more powerful - it allows you to use gdb with additional scripts for inspecting the kernel state. kdb currently allows for setting breakpoints and single-stepping, and has some of the kernel inspection capabilities of kgdb. It no longer has the option to display code disassembly. One advantage of kdb, though, is that only kdb can be run on the machine being debugged (i.e. without a second machine).

If you want to use kgdb, the following kernel options are necessary or recommended: ``CONFIG_KGDB=y``, ``CONFIG_KGDB_SERIAL_CONSOLE=y``, ``CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y``, ``CONFIG_GDB_SCRIPTS=y``, ``CONFIG_DEBUG_INFO_REDUCED=n``.

For kdb, the options are: ``CONFIG_KGDB=y``, ``CONFIG_KGDB_KDB=y``, ``CONFIG_DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT=y``, ``CONFIG_KDB_KEYBOARD=y``, ``CONFIG_KGDB_SERIAL_CONSOLE=y``, ``CONFIG_DEBUG_INFO_REDUCED=n``.

.. admonition:: Hands-on

   - follow the guide to debug the kernel with kgdb: https://www.kernel.org/doc/html/latest/process/debugging/gdb-kernel-debugging.html

     - any necessary kernel config options are already set in the provided image, but you need to set ``nokaslr``
     - run kgdb from the build directory provided in Resources
     - try the commands from the guide

       - to observe loading a module, you can use ``dummy``, which is available in the provided kernel (``modprobe dummy``)
       - note that the interface of ``lx_per_cpu()`` has changed slightly: unlike in the guide, the name of the per-cpu variable should be provided without quotes (and tab-completion works)

   - play with other kgdb commands (run ``apropos lx`` for a list)
   - use gdb functions (cheatsheets such as https://github.com/reveng007/GDB-Cheat-Sheet are useful)

     - set breakpoints and try to trigger them, e.g. set a breakpoint on ``do_open()`` and then read a file
     - examine the stack trace at a breakpoint, in particular the arguments and local variables of the different functions in the stack trace
     - examine the registers
     - examine the surrounding code in C and assembly
     - single-step through the function at source code and assembly level, enter some of the function calls.
       Try different layouts (asm, src, split)
     - run the function until return
     - set a watchpoint on a variable and trigger it
     - set a conditional breakpoint and trigger it
     - delete and disable some breakpoints
     - print an expression
     - print a type definition

   - debug the kernel with kdb

     - to enter kdb, you need to configure its I/O and then trigger it with SysRq. You can do this by running ``echo kbd > /sys/module/kgdboc/parameters/kgdboc; echo g > /proc/sysrq-trigger``
     - see what you can do with kdb, in particular which of the previous steps you can repeat (run ``help`` for a command list)

References:

- https://www.kernel.org/doc/html/latest/process/debugging/kgdb.html

Crash dumps
===========

A bug may cause the kernel to crash or hang. In such a case, generating a crash dump that contains information about the kernel's state at the time of the crash can be helpful. Crash dumps can be generated using kdump, which is a mechanism that utilizes kexec to boot a second kernel that captures the crash dump in case of a crash.

There are several ways to force a kernel crash on purpose. The simplest one is by issuing the magic SysRq command ``c``. To issue this command to the VM you cannot use the keyboard normally, since the command would be interpreted by the host; one way of issuing the command that works on the VM is by writing to ``/proc/sysrq-trigger``.

Another way to simulate errors and crashes is by using the Linux Kernel Dump Test Module (LKDTM) (``CONFIG_LKDTM``). It can be controlled from DebugFS (see https://www.kernel.org/doc/html/latest/fault-injection/provoke-crashes.html).

.. admonition:: Hands-on

   - enable crash dumps by installing ``kdump-tools`` via apt

     - select No for ``kexec-tools`` handling reboots and Yes for enabling ``kdump-tools``
     - since the kernel with many debugging options enabled requires more memory, after installing you need to increase the crashkernel size by changing ``crashkernel=384M-:128M`` to ``crashkernel=384M-:256M`` in ``/etc/default/grub.d/kdump-tools.cfg`` and running ``sudo update-grub``
     - then reboot

   - force a crash using Magic SysRq
   - force some errors and crashes using the LKDTM. Experiment with different ones, such as ``WARNING``, ``LOOP``, ``PANIC``, ``BUG``, etc. (a full list can be found by reading the file ``/sys/kernel/debug/provoke-crash/DIRECT``). Try different crash points, such as ``DIRECT`` (trigger immediately) and ``INT_HW_IRQ_EN`` (trigger on ``handle_irq_event()``).
   - examine both crash dumps using the ``crash`` command

     - the ``crash`` installed from the Debian repository seems to have some issue with gdb crashing, so instead use the provided ``/home/zso/crash``
     - pass the ``vmlinux`` from the kernel compilation directory to ``crash``
     - examine the dump using the different available commands (see ``man crash`` or ``help`` in the ``crash`` prompt). In particular check the backtrace, processes, machine information, the kernel log, registers and the failing code. Remember that you can run gdb commands inside ``crash`` (if a gdb command name conflicts with a crash command, run it as ``gdb <command>``)

References:

- https://docs.kernel.org/admin-guide/kdump/kdump.html
- https://www.cyberciti.biz/faq/how-to-on-enable-kernel-crash-dump-on-debian-linux/
- https://linux.die.net/man/8/crash

Stack traces
============

If the kernel detects a bug and does not crash, it prints a stack trace. The stack trace contains information such as the function call trace, register values and loaded modules.

Some scripts in the kernel source code help with working with stack traces:

- ``scripts/decodecode`` - disassembles the code bytes printed by kernel oopses
- ``scripts/decode_stacktrace.sh`` - converts byte offsets in the function call trace to line numbers

.. admonition:: Hands-on

   - use ``decode_stacktrace.sh`` (which also uses ``decodecode``) to examine the stacktraces from the crashes you triggered (take the stacktraces from the dmesg dumps)

References:

- https://www.kernel.org/doc/html/latest/admin-guide/bug-hunting.html

Runtime error checkers
======================

The kernel has several mechanisms for error checking at runtime, in particular:

- `KASAN (Kernel Address Sanitizer) `_ - helps find use-after-free and out-of-bounds bugs (``CONFIG_KASAN``). Alternatively, you can use `KFENCE (Kernel Electric-Fence) `_, which has lower precision, but also a lower overhead that makes it suitable for production code.
- `UBSAN (Undefined Behavior Sanitizer) `_ - detects undefined behavior, such as a bit shift by a negative value (``CONFIG_UBSAN``)
- `lockdep (Lock Dependency Validator) `_ - detects potential deadlocks and other locking-related issues (``CONFIG_PROVE_LOCKING``)
- `KCSAN (Kernel Concurrency Sanitizer) `_ - detects data races (``CONFIG_KCSAN``)
- `Kmemleak (Kernel Memory Leak Detector) `_ - detects memory leaks (by default scans the memory every 10 minutes) (``CONFIG_DEBUG_KMEMLEAK``)
- `KMSAN (Kernel Memory Sanitizer) `_ - detects uses of uninitialized values (``CONFIG_KMSAN``)

.. admonition:: Hands-on

   - introduce bugs into the hello module that will be detected by the above checkers and verify that they are detected (note that in the provided kernel KCSAN and KMSAN are disabled, since they are incompatible with KASAN)

DebugFS
=======

DebugFS (``CONFIG_DEBUG_FS``) makes it easy to expose kernel variables to user space for read or write access via files under ``/sys/kernel/debug``.
Any ``struct file_operations`` can be provided for these files, but there are also convenient helpers for creating files that access integer variables or (read-only) binary blobs and blocks of registers. See https://www.kernel.org/doc/html/latest/process/debugging/driver_development_debugging_guide.html#id9 and https://www.kernel.org/doc/html/latest/filesystems/debugfs.html for details.

Note that if you need to transfer large quantities of data from the kernel to user space, DebugFS can be used in conjunction with the relay interface to create a circular buffer that can be written to by the kernel and read in user space by reading a DebugFS file (see https://docs.kernel.org/filesystems/relay.html).

.. admonition:: Hands-on

   - expose the ``hello_repeats`` variable from the hello module for read-write access via DebugFS
   - verify that the file works correctly

Fault injection
===============

Some kernel functions support fault injection: they can be forced to return an error regardless of whether there was an actual error. The fault injection mechanism can be controlled via DebugFS and provides several parameters that specify when and how the injection should happen, for example depending on the stacktrace (see https://www.kernel.org/doc/html/latest/fault-injection/fault-injection.html).

The related kernel config options are: ``CONFIG_FAULT_INJECTION``, ``CONFIG_FAULT_INJECTION_DEBUG_FS``; configs for each failure type, e.g. ``CONFIG_FAILSLAB`` for slab allocation failures, ``CONFIG_FUNCTION_ERROR_INJECTION`` for injecting specific error return values.

.. admonition:: Hands-on

   - inject a single user memory access failure with verbosity set to 2
   - examine the kernel log