Class 8: Kernel modules

Date: 15.04.2025

What is a module?

A module is relocatable code and data that can be inserted into and removed from the kernel while the system is running. A module may refer to exported kernel symbols as if it were compiled as part of the kernel, and may itself export symbols for other modules to use. A module typically provides a specific service in the kernel -- for example, modules can be device and file system drivers, network filters, cryptographic algorithms, etc.

Modules are compiled for a specific version and configuration of the kernel -- loading a module built for another version (or for the same version with a significantly different configuration) will most likely fail. The module loader tries to detect and reject such modules.

Creating modules

Kernel modules (like the main kernel code) are written in C (support for Rust is a work in progress). The environment inside the kernel, however, is quite specific and differs greatly from that of an ordinary user-space program.

Kernel code is written according to the official coding style -- https://www.kernel.org/doc/html/v6.12/process/coding-style.html .

Compilation of modules

To compile modules, you need a directory with the configured and compiled kernel source. In principle, the header files and the configuration are enough, but separating the necessary files from the rest is a very complicated process, and only Linux distributions with a large number of their own scripts are able to do so. Modules (like the kernel itself) are built by the Kbuild system, which is a fairly complicated layer on top of make.

To compile our module, we need to create a Kbuild file describing our code, for example:

obj-m := module.o different_module.o

compiles the module.c file to the module.ko module, and the different_module.c file to the different_module.ko file.

If we want to combine several source files into one module, we can do it as follows:

obj-m := module.o
module-objs := module_p1.o module_p2.o

This Kbuild file will compile the module_p1.c and module_p2.c files and combine them into the module.ko module.

To build the module, run make against the kernel source directory (the -C option) and point it at our directory with external modules (the M variable):

make -C /usr/src/linux-<version> M=/home/<user>/my_modules

For simplicity, you can write your own Makefile that calls the appropriate command (see example).
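
A minimal sketch of such a wrapper Makefile. It assumes that the Kbuild file described above sits in the same directory and that the kernel build tree is reachable through the /lib/modules/$(uname -r)/build symlink that most distributions install. The KDIR variable name is our own choice. Recipe lines must start with a tab character.

```make
# Kernel build tree; override on the command line for a custom tree.
KDIR ?= /lib/modules/$(shell uname -r)/build

.PHONY: default clean

# Build the modules listed in the Kbuild file of this directory.
default:
	$(MAKE) -C $(KDIR) M=$(CURDIR) modules

# Remove the generated objects and .ko files.
clean:
	$(MAKE) -C $(KDIR) M=$(CURDIR) clean
```

With this file in place, running make in the module directory builds the modules, and make KDIR=/usr/src/linux-<version> selects a different kernel tree.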

Module metadata

Each module can (but does not have to) define metadata using the following macros (defined in linux/module.h):

MODULE_LICENSE("GPL");
MODULE_AUTHOR("Horse Fred");
MODULE_DESCRIPTION("Driver for my device");

Metadata defined in this way is stored (along with other information) in the .modinfo section of the finished module and can be printed with the modinfo command.

Choosing a license has an important and non-obvious effect -- declaring a GPL-compatible license allows the module to use kernel symbols that are exported only to GPL-licensed modules. The following license strings are recognized as compatible:

  • "GPL" -- GNU General Public License v2 or later,

  • "GPL v2" -- GNU General Public License v2,

  • "GPL and additional rights" -- GNU General Public License v2 with additional rights,

  • "Dual BSD/GPL" -- GNU General Public License v2 or BSD license, at the user's choice,

  • "Dual MPL/GPL" -- GNU General Public License v2 or Mozilla Public License, at the user's choice,

  • "Dual MIT/GPL" -- GNU General Public License v2 or MIT license, at the user's choice.

Module constructor and destructor

Modules do not have a main function or their own process / thread (unless they create one themselves, which is quite rare). Instead, the module's code is called by various kernel subsystems when there is work for it to do.

Each module can define a function that initializes the module (constructor) and a function that releases it (destructor). These functions are defined as follows:

#include <linux/module.h>

static int my_init_function(void)
{
    /* ... */
    return 0;
}

static void my_cleanup_function(void)
{
    /* ... */
}

module_init(my_init_function);
module_exit(my_cleanup_function);

The init function is called when the module is loaded. On success it should return 0. If it fails to initialize the module, it should return an error code (a negated code from errno*.h) -- the kernel will then immediately remove the module.

The cleanup function is called when the module is removed (but is not called when the init function has returned an error).

The task of the init function is to register the functionality provided by the module with the kernel -- for example, a PCI device driver uses it to tell the PCI subsystem which devices it supports and which functions should be called when a matching device is detected. Without such registration, the kernel will never call the code of our module, so modules without an init function are basically only useful as code libraries for other modules.

The task of the cleanup function is to reverse everything that the init function has done and to clean up after all of the module's activity. If the module has an init function, always provide a cleanup function (even an empty one) -- otherwise, the kernel will assume that our module does not support removal and will not allow rmmod to be executed on it.

Sometimes you can find older modules that use functions with the default names of init_module() and cleanup_module(), without declaring them with module_init() and module_exit(). This is not recommended in current kernel versions.

A module should have only one constructor and only one destructor.

The first example module shows the use of printk as well as the constructor and the destructor.
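
The init/cleanup contract described in this section can be sketched as a complete module. This is an illustration only, not one of the course samples: the hello_buf name and the scratch buffer are hypothetical, while kmalloc(), kfree() and pr_info() are regular kernel interfaces. The code builds only through Kbuild against a kernel tree, not as a user-space program.

```c
/* hello_buf.c -- hypothetical example of an init/cleanup pair */
#include <linux/errno.h>
#include <linux/module.h>
#include <linux/printk.h>
#include <linux/slab.h>

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Sketch of a module constructor and destructor");

/* Hypothetical resource owned by the module for its whole lifetime. */
static char *scratch;

static int hello_buf_init(void)
{
    scratch = kmalloc(128, GFP_KERNEL);
    if (!scratch)
        return -ENOMEM; /* loading fails; the cleanup function is not called */

    pr_info("hello_buf: loaded\n");
    return 0;
}

static void hello_buf_exit(void)
{
    kfree(scratch); /* reverse what the init function did */
    pr_info("hello_buf: unloaded\n");
}

module_init(hello_buf_init);
module_exit(hello_buf_exit);
```

Because the cleanup function is skipped when init returns an error, an init function that acquires several resources has to release the ones it already holds before returning the error code.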

Using external symbols

In modules, you can freely use symbols defined and exported by the main kernel code and by other modules (you can view them in the file /proc/kallsyms).

In order for a symbol of our module to be visible from the outside, it should be exported with the macro EXPORT_SYMBOL:

int my_function(int x)
{
    /* ... */
}
EXPORT_SYMBOL(my_function);

There is also a similar macro EXPORT_SYMBOL_GPL, exporting a symbol only for modules under the GPL (or compatible) license.

The depmod program automatically collects information on dependencies between modules resulting from the use of exported symbols and records it in the modules.dep file, which modprobe uses to load modules in the correct order.

The second example module shows the export of symbols and the use of exported symbols.

Parameterization of modules

You can declare that a specified variable will contain a parameter that can be changed when the module is loaded. The name of the parameter is the same as the variable name.

When the module is loaded, the values given by the user (if any) are stored in these variables, e.g.:

insmod module.ko irq=5

stores the value of 5 into the irq variable.

To declare that a variable is to be used as a parameter of the module, use the following macro:

module_param(variable, type, permissions);

Types can be: byte, short, ushort, int, uint, long, ulong, charp, bool, invbool. The charp type is used to pass strings (char *). The invbool type means a bool parameter, which is a negation of the value given by the user.

You can define your own parameter types -- in current kernels, a type XXX needs a struct kernel_param_ops named param_ops_XXX (providing the set and get callbacks) and a param_check_XXX macro that checks the type of the variable.

The permissions argument gives the file permissions of the parameter's entry in sysfs; a value of 0 means that no sysfs entry is created.

Each parameter should have a description. The modinfo program prints parameter descriptions along with the description of the whole module, so the module carries its own usage documentation. The description is given with the macro MODULE_PARM_DESC:

MODULE_PARM_DESC(variable, description);

Examples:

static int irq = 7;
module_param(irq, int, 0);
MODULE_PARM_DESC(irq, "Irq used for device");

static char *path = "/sbin/modprobe";
module_param(path, charp, 0);
MODULE_PARM_DESC(path, "Path to modprobe");

Use:

printk(KERN_INFO "Using irq: %d\n", irq);
printk(KERN_INFO "Will use path: %s\n", path);

To declare an array of parameters, you must use a different macro:

module_param_array(variable, type, pointer_to_count, permissions);

All arguments except pointer_to_count have the same meaning as in module_param(). pointer_to_count is a pointer to an unsigned int variable in which the number of elements passed by the user will be stored. If you are not interested in the number of elements, you can pass NULL, but then you have to recognize whether an element is present based on its contents, which is not recommended. The maximum number of elements is determined by the array declaration, e.g., if the array is declared with size 4, the user can pass at most 4 elements. In the description of an array parameter, the maximum number of elements is normally given in square brackets.

Example:

static unsigned int num_paths = 2;
static char *paths[4] = {"/bin", "/sbin", NULL, NULL};
module_param_array(paths, charp, &num_paths, 0);
MODULE_PARM_DESC(paths, "Search paths [4]");

Use:

unsigned int i;
for (i = 0; i < num_paths; ++i)
    printk(KERN_INFO "Path[%u]: %s\n", i, paths[i]);

The third example module shows the use of parameters.

Automatic loading of required modules -- kmod

Kmod is a kernel subsystem that loads modules "on demand", for instance when a service provided by a module that is not yet loaded is requested.

When a user requests access to a device that is supported by a module that is not loaded, the kernel suspends execution of the program and calls request_module() to request loading of the appropriate module. This function is provided by kmod and works by running a user-space program (/sbin/modprobe by default; this can be changed through /proc/sys/kernel/modprobe) for the requested module.

To use on-demand loading in your own module, include:

#include <linux/kmod.h>

On-demand loading is requested with the following function, which takes a printf-style format string describing the module name and returns 0 on success:

int request_module(const char *fmt, ...);
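
A minimal sketch of how a subsystem might use on-demand loading. Only request_module() is a real kernel interface here; struct codec, find_codec() and the "codec-<name>" naming scheme are hypothetical and serve only as illustration. As kernel code, it builds only through Kbuild.

```c
#include <linux/kmod.h>

struct codec; /* hypothetical type, defined elsewhere in the subsystem */

/* Hypothetical lookup in the subsystem's list of registered codecs. */
struct codec *find_codec(const char *name);

static struct codec *get_codec(const char *name)
{
    struct codec *c = find_codec(name);

    if (!c) {
        /* Not registered yet: ask user space (modprobe) to load it. */
        if (request_module("codec-%s", name) != 0)
            return NULL;
        /* The init function of the loaded module should have registered it. */
        c = find_codec(name);
    }
    return c;
}
```

The second lookup is needed because a successful request_module() only means that modprobe finished without error; whether the service actually appeared still has to be checked. For modprobe to resolve such a name, the providing module would have to carry that name or declare a matching alias with MODULE_ALIAS.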

Reference count

Each module has its own reference count -- as long as it is positive, the kernel will not allow the module to be removed. It should be increased when our module is in active use (e.g., it handles an open device or a mounted file system). The management of such a counter is usually done by other kernel subsystems, but you have to help them by passing the pointer to your module (macro THIS_MODULE). For example, for a character device driver, you must fill the owner field of the file_operations structure with this pointer.
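
The counter can be observed from user space: each line of /proc/modules lists the module name, its size, and its reference count (the number lsmod prints in the "Used by" column). The sketch below is a user-space helper, not kernel code; the function names are our own, and it assumes a kernel with module unloading enabled, where the third field is a number.

```c
#include <stdio.h>

/*
 * Parse one line of /proc/modules, e.g.
 *   "jbd2 131072 1 ext4, Live 0x0000000000000000"
 * Fields: name, size in bytes, reference count, users, state, address.
 * Returns 0 on success, -1 if the line does not match.
 * name must point to a buffer of at least 64 bytes.
 */
int parse_modules_line(const char *line, char *name,
                       unsigned long *size, long *refcnt)
{
    if (sscanf(line, "%63s %lu %ld", name, size, refcnt) != 3)
        return -1;
    return 0;
}

/* Print name and reference count of every loaded module. */
int print_refcounts(void)
{
    char line[512], name[64];
    unsigned long size;
    long refcnt;
    FILE *f = fopen("/proc/modules", "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f))
        if (parse_modules_line(line, name, &size, &refcnt) == 0)
            printf("%-24s %ld\n", name, refcnt);
    fclose(f);
    return 0;
}
```

Calling print_refcounts() before and after opening a device served by a module is a quick way to watch the counter change.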

Exercise

Hands-on

  • Compile and run the sample modules.

  • Experimentally investigate the maximum size that can be allocated with kmalloc.

  • Make the 4th example work for larger buffers (using vmalloc).

  • Find and explain the security hole in one of the sample codes. Consider the consequences of this type of error in kernel code. Trigger a kernel panic.

Small task #7

Take a look at the crypto/md5.c module. Using it as a reference, write and compile your own module computing the Adler-32 checksum.

The module should expose an AF_ALG socket interface like the original. You may use test-adler32.c for testing.
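
For checking the results of the module, a user-space reference computation of Adler-32 may be handy. The sketch below is not the module itself and does not use the kernel crypto API; the function name is our own. It is written in an incremental form (the running value is passed in and returned), which is one possible way to match an interface that feeds data in several chunks.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER32_MOD 65521u /* largest prime below 2^16 */

/*
 * Continue an Adler-32 computation over buf[0..len).
 * Pass 1 as the initial value to start a fresh checksum.
 */
uint32_t adler32_update(uint32_t adler, const unsigned char *buf, size_t len)
{
    uint32_t a = adler & 0xffffu;         /* low half: 1 + sum of bytes */
    uint32_t b = (adler >> 16) & 0xffffu; /* high half: sum of the a values */
    size_t i;

    for (i = 0; i < len; i++) {
        a = (a + buf[i]) % ADLER32_MOD;
        b = (b + a) % ADLER32_MOD;
    }
    return (b << 16) | a;
}
```

For example, the ASCII string "Wikipedia" yields 0x11E60398, and splitting the input into several calls gives the same value as a single call.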
