Class 8: Kernel modules
Date: 15.04.2025
Additional materials
examples.tar -- example modules
What is a module?
A module is relocatable code and data that can be inserted into and removed from the kernel while the system is running. A module may refer to (exported) kernel symbols as if it were compiled as part of the kernel, and may itself provide (export) symbols for other modules to use. A module typically provides a specific service in the kernel -- for example, modules can be device and file system drivers, network filters, cryptographic algorithms, etc.
Modules are compiled for a specific kernel version and configuration -- loading a module built for another version (or for the same version with a significantly different configuration) will probably fail. The module loader tries to detect and prevent such a situation.
Creating modules
Kernel modules (like the main kernel code) are written in C (Rust support is still a work in progress). The environment inside the kernel, however, is quite specific, and writing kernel code differs greatly from writing an ordinary user-space program.
Kernel code is written according to the official coding style: https://www.kernel.org/doc/html/v6.12/process/coding-style.html
Compilation of modules
To compile modules, you need a directory with configured and compiled kernel sources. In principle, the header files and the configuration are enough, but separating the required files from the rest is a complicated process, which in practice only Linux distributions do, using a large number of their own scripts. Modules (like the kernel itself) are built by the Kbuild system, a fairly complex layer on top of make.
To compile our module, we need to create a Kbuild file describing our code. For example:
obj-m := module.o different_module.o
compiles the module.c file into the module.ko module and the different_module.c file into the different_module.ko module.
If we want to combine several source files into one module, we can do it as follows:
obj-m := module.o
module-objs := module_p1.o module_p2.o
This Kbuild file will compile the module_p1.c and module_p2.c files and combine them into the module.ko module.
To build the module, run make in the kernel source directory, pointing it to the directory with our external modules:
make -C /usr/src/linux-<version> M=/home/<user>/my_modules
For simplicity, you can write your own Makefile that calls the appropriate command (see example).
Module metadata
Each module can (but does not have to) define metadata using the following macros (defined in linux/module.h):
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Horse Fred");
MODULE_DESCRIPTION("Driver for my device");
Metadata defined in this way is stored (along with other data) in the .modinfo section of the finished module and can be printed with the modinfo command.
Choosing a license has an important and non-obvious effect: only a module with a GPL-compatible license can use kernel symbols marked as available exclusively to GPL-licensed modules. The following license strings are recognized as compatible:
"GPL" -- GNU General Public License v2 (current kernel documentation does not distinguish "v2 only" from "v2 or later" here)
"GPL v2" -- GNU General Public License v2 (same as "GPL", kept for historical reasons)
"GPL and additional rights" -- GNU General Public License v2 + additional rights
"Dual BSD/GPL" -- GNU General Public License v2 or BSD license, at the user's choice
"Dual MPL/GPL" -- GNU General Public License v2 or Mozilla Public License, at the user's choice
"Dual MIT/GPL" -- GNU General Public License v2 or MIT license, at the user's choice
Module constructor and destructor
Modules do not have a main function or their own process / thread (unless they create one themselves, which is quite rare). Instead, the module's code is called by various kernel subsystems when there is something for it to do.
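Each module can define a function that initializes the module (constructor) and a function that releases it (destructor). These functions are defined as follows:
static int my_init_function(void)
{
	/* ... register the module's functionality ... */
	return 0;	/* 0 on success, negated errno on failure */
}
static void my_cleanup_function(void)
{
	/* ... undo everything done by my_init_function() ... */
}
module_init(my_init_function);
module_exit(my_cleanup_function);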
The init function is called when the module is loaded. If everything went well, it should return 0. If it failed to initialize the module, it should return an error code (a negated code from errno*.h) -- the kernel will then remove the module immediately.
The cleanup function is called when the module is removed (but is not called when the init function has returned an error).
The task of the init function is to register the functionality provided by the module into the kernel structures -- for example, a PCI device driver will in this function inform the PCI subsystem of supported devices and functions that should be triggered when a matching device is detected. Without such registration, the kernel will never call the code of our module, so modules without an initializing function are basically only useful as code libraries for other modules.
The task of the cleanup function is to reverse everything that the init function has done and to clean up after all of the module's activity. If the module has an init function, always provide a cleanup function (even an empty one) -- otherwise, the kernel will conclude that our module does not support removal and will not allow rmmod to be executed on it.
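The error-code convention and the reverse-order cleanup described above can be tried out in user space. The following is only a sketch, not kernel code: register_a(), register_b(), their unregister counterparts and the fail_at argument are hypothetical stand-ins for real registrations, introduced here for illustration. The point is the shape of the init function: on failure it undoes only the steps that already succeeded and returns a negated errno, while the cleanup function undoes everything in reverse order.

```c
#include <errno.h>

/* Hypothetical "registrations"; a real module would register with kernel subsystems. */
static int a_registered, b_registered;

/* fail_at is a test knob: 1 makes step A fail, 2 makes step B fail, 0 means no failure. */
static int register_a(int fail_at)
{
	if (fail_at == 1)
		return -ENOMEM;
	a_registered = 1;
	return 0;
}

static void unregister_a(void)
{
	a_registered = 0;
}

static int register_b(int fail_at)
{
	if (fail_at == 2)
		return -EBUSY;
	b_registered = 1;
	return 0;
}

static void unregister_b(void)
{
	b_registered = 0;
}

/* Plays the role of the init function: 0 on success, negated errno on failure. */
int demo_init(int fail_at)
{
	int ret;

	ret = register_a(fail_at);
	if (ret)
		return ret;

	ret = register_b(fail_at);
	if (ret)
		goto err_unregister_a;

	return 0;

err_unregister_a:
	unregister_a();		/* undo only what already succeeded */
	return ret;
}

/* Plays the role of the cleanup function: undo everything, in reverse order. */
void demo_cleanup(void)
{
	unregister_b();
	unregister_a();
}
```

Because a failed init is never followed by a call to the cleanup function, the init function has to leave nothing behind on its error paths; the goto label is the usual way of keeping that unwinding in one place.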
Sometimes you can find older modules that use functions with the default names init_module() and cleanup_module(), without declaring them with module_init() and module_exit(). This is not recommended in current kernel versions.
A module should have only one constructor and only one destructor.
The first example module shows the use of printk as well as the constructor and the destructor.
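For orientation, a complete module with metadata, a constructor and a destructor might look roughly like the sketch below. It is not necessarily identical to the example in examples.tar; the names hello_init and hello_exit and the message texts are made up for illustration, and the __init / __exit annotations are an optional addition not discussed in this text. The file has to be built with Kbuild against kernel sources, not as an ordinary program.

```c
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Minimal sketch: constructor, destructor and printk");

/* Called once when the module is loaded (insmod / modprobe). */
static int __init hello_init(void)
{
	printk(KERN_INFO "hello: module loaded\n");
	return 0;	/* a negated errno here would abort the load */
}

/* Called once when the module is removed (rmmod). */
static void __exit hello_exit(void)
{
	printk(KERN_INFO "hello: module unloaded\n");
}

module_init(hello_init);
module_exit(hello_exit);
```

After loading such a module with insmod, the messages should appear in the kernel log (for example in the output of dmesg).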
Using external symbols
In modules, you can freely use symbols defined and exported by the main kernel code and by other modules (they are listed, together with non-exported symbols, in the file /proc/kallsyms).
In order for a symbol of our module to be visible from the outside, it should be exported with the macro EXPORT_SYMBOL, placed after the definition of the symbol:
int my_function(int x)
{
	/* ... */
	return 0;
}
EXPORT_SYMBOL(my_function);
There is also a similar macro, EXPORT_SYMBOL_GPL, which exports a symbol only to modules under the GPL (or a compatible) license.
The depmod program automatically collects information on dependencies between modules resulting from the use of exported symbols; modprobe uses this information to load modules in the correct order.
The second example module shows the export of symbols and the use of exported symbols.
Parameterization of modules
You can declare that a specified variable will contain a parameter that can be changed when the module is loaded. The name of the parameter is the same as the variable name.
When the module is being loaded, the values given by the user (if any) will be stored in the corresponding variables. For example:
insmod module.ko irq=5
stores the value 5 in the irq variable.
To declare that a variable is to be used as a parameter of the module, use the following macro:
module_param(variable, type, permissions);
Types can be: byte, short, ushort, int, uint, long, ulong, charp, bool, invbool. The charp type is used to pass strings (char *). The invbool type means a bool parameter whose value is the negation of the value given by the user.
You can define your own parameter types. To do that in current kernels, you must define a param_ops_XXX structure (a struct kernel_param_ops pointing to the param_set_XXX and param_get_XXX functions) and param_check_XXX.
permissions means the file permissions that will be given to the parameter's file in sysfs (under /sys/module/<module>/parameters/); a value of 0 means that no such file is created.
Each parameter should have a description. The description of a parameter can be read, along with the description of the entire module, using the modinfo program, so the module carries its own usage documentation. The description is given with the macro MODULE_PARM_DESC:
MODULE_PARM_DESC(variable, description);
Examples:
static int irq = 7;
module_param(irq, int, 0);
MODULE_PARM_DESC(irq, "IRQ used for the device");
static char *path = "/sbin/modprobe";
module_param(path, charp, 0);
MODULE_PARM_DESC(path, "Path to modprobe");
Use:
printk(KERN_INFO "Using irq: %d\n", irq);
printk(KERN_INFO "Will use path: %s\n", path);
To declare an array of parameters, you must use a different macro:
module_param_array(variable, type, pointer_to_count, permissions);
All fields except pointer_to_count have the same meaning as in module_param(). pointer_to_count is a pointer to the variable to which the number of elements in the array will be written. If you are not interested in the number of arguments, you can specify NULL, but then you have to recognize whether an argument is present based on its contents, which is not recommended. The maximum number of array elements is determined by the array declaration, e.g., if we declare an array of size 4, the user will be able to pass at most 4 elements.
In the description of an array parameter, the maximum number of parameters
is normally placed in square brackets.
Example:
static unsigned int num_paths = 2;	/* the count is stored through an unsigned int pointer */
static char *paths[4] = {"/bin", "/sbin", NULL, NULL};
module_param_array(paths, charp, &num_paths, 0);
MODULE_PARM_DESC(paths, "Search paths [4]");
Use:
unsigned int i;
for (i = 0; i < num_paths; ++i)
	printk(KERN_INFO "Path[%u]: %s\n", i, paths[i]);
The third example module shows the use of parameters.
Automatic loading of required modules -- kmod
Kmod is a kernel subsystem that loads modules "on demand", for instance when a service provided by a module that is not yet loaded is requested.
When a user requests access to a device that is supported by a module that is not loaded, the kernel suspends execution of the program and calls the function request_module() to request loading of the appropriate module. This function is provided by kmod and works by executing a program (/sbin/modprobe by default, but this can be changed through /proc/sys/kernel/modprobe) for the requested module.
If a module is to use on-demand loading, it should include:
#include <linux/kmod.h>
On-demand loading is done with the following function (the module name is given as a printf-style format string, optionally followed by arguments):
int request_module(const char *module_name, ...)
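A call to request_module() from a module's init function might look like the sketch below. The module name "hypothetical_helper" and the function names are invented for illustration; the code is meant to be built with Kbuild. A return value of 0 only means that the loading program ran successfully, so a real caller should still check that the service it needs has actually become available.

```c
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/kmod.h>
#include <linux/module.h>

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Sketch: asking kmod to load another module");

static int __init demo_init(void)
{
	/* "hypothetical_helper" is a made-up module name used for illustration. */
	int ret = request_module("hypothetical_helper");

	if (ret)
		printk(KERN_WARNING "demo: could not load helper (%d)\n", ret);

	/* This sketch treats the helper as optional and loads anyway. */
	return 0;
}

static void __exit demo_exit(void)
{
}

module_init(demo_init);
module_exit(demo_exit);
```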
Reference count
Each module has its own reference count -- as long as it is positive, the kernel will not allow the module to be removed. It should be increased while our module is in active use (e.g., while it handles an open device or a mounted file system). Such a counter is usually managed by other kernel subsystems, but you have to help them by passing a pointer to your module (the macro THIS_MODULE). For example, for a character device driver, you must fill the owner field of the file_operations structure with this pointer.
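For the character device case, filling the owner field might look roughly like the following sketch. The device name "refcount_demo" and the demo_* identifiers are hypothetical, and register_chrdev() is used here only as the shortest way to attach the structure to a device. With owner set, the kernel takes a reference to the module while the device is open, so rmmod should be refused until the last user closes the device.

```c
#include <linux/fs.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Sketch: letting the kernel manage the module reference count");

static int demo_major;	/* major number assigned by the kernel */

static int demo_open(struct inode *inode, struct file *file)
{
	return 0;
}

static const struct file_operations demo_fops = {
	.owner = THIS_MODULE,	/* lets the kernel pin this module while the device is open */
	.open = demo_open,
};

static int __init demo_init(void)
{
	/* Passing 0 asks the kernel to pick a free major number. */
	int ret = register_chrdev(0, "refcount_demo", &demo_fops);

	if (ret < 0)
		return ret;
	demo_major = ret;
	return 0;
}

static void __exit demo_exit(void)
{
	unregister_chrdev(demo_major, "refcount_demo");
}

module_init(demo_init);
module_exit(demo_exit);
```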
Exercise
Hands-on
1. Compile and run the sample modules.
2. Experimentally investigate the maximum size that can be allocated with kmalloc.
3. Make the 4th example work for larger buffers (using vmalloc).
4. Find and explain the security hole in one of the sample codes. Consider the consequences of this type of error in kernel code. Trigger a kernel panic.
Small task #7
Take a look at the crypto/md5.c module. Using it as a reference, write and compile your own module computing the Adler-32 checksum. The module should expose an AF_ALG socket interface like the original. You may use test-adler32.c for testing.
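When checking the module's output, it may help to have a plain user-space computation to compare against. The sketch below assumes the standard Adler-32 definition (two 16-bit sums modulo 65521, starting value 1); the function name adler32_update is made up here and says nothing about how the kernel module should be structured. It is deliberately the simplest form, reducing modulo 65521 after every byte.

```c
#include <stddef.h>
#include <stdint.h>

#define ADLER32_MOD 65521u	/* largest prime below 2^16 */

/*
 * Continue an Adler-32 computation over len bytes of data.
 * Pass 1 as adler for the first (or only) chunk.
 */
uint32_t adler32_update(uint32_t adler, const void *data, size_t len)
{
	const unsigned char *p = data;
	uint32_t a = adler & 0xffff;		/* low half: sum of bytes */
	uint32_t b = (adler >> 16) & 0xffff;	/* high half: sum of the running sums */
	size_t i;

	for (i = 0; i < len; i++) {
		a = (a + p[i]) % ADLER32_MOD;
		b = (b + a) % ADLER32_MOD;
	}
	return (b << 16) | a;
}
```

For example, adler32_update(1, "Wikipedia", 9) gives 0x11E60398, and feeding the same bytes in two chunks gives the same value, which is the property an incremental update-style interface relies on.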
Literature
1. man insmod, rmmod, lsmod, modprobe, depmod, modinfo
2. A. Rubini, J. Corbet, G. Kroah-Hartman, Linux Device Drivers, 3rd edition, O'Reilly, 2005. (http://lwn.net/Kernel/LDD3/)
3. Peter Salzman, Ori Pomerantz, "The Linux Kernel Module Programming Guide", 2001 -- http://www.faqs.org/docs/kernel
4. Documentation/kbuild/makefiles.txt, modules.txt