Task 2: The HardDoom™ Device driver
Announced at: 10.04.2018
Due at: 15.05.2018 (final 29.05.2018)
Extra materials
- z2-harddoom-en
- Device emulator: https://github.com/koriakin/qemu (branch harddoom)
- Test program: https://github.com/koriakin/prboom-plus (branch doomdev)
- Header file with hardware register definitions (to be copied to solution code): https://github.com/koriakin/qemu/blob/harddoom/hw/misc/harddoom.h
- Header file with character device interface definitions (to be copied to solution code): https://github.com/koriakin/prboom-plus/blob/doomdev/src/doomdev.h
Introduction
The task is to write a driver for the HardDoom™ device, which is a graphics accelerator designed for Doom. The device is delivered in the form of a modified version of qemu.
The device should be available to the user in the form of a character device.
For each HardDoom™ device present in the system, create a /dev/doomX
character device, where X is the index of the HardDoom™ device,
starting with 0.
Character device interface
The device /dev/doom*
is used only to create HardDoom™ resources – all
the proper operations will be performed on the created resources.
It should support the following operations:
- open: obviously.
- close: obviously.
- ioctl(DOOMDEV_IOCTL_CREATE_SURFACE): creates a new frame buffer on the device. The parameters of this call are the dimensions of the buffer (width and height). The width must be a multiple of 64 in the range 64 … 2048 and the height must be in the range 1 … 2048. The result of this call is a new file descriptor referring to the created buffer. The created buffer has undefined content.
- ioctl(DOOMDEV_IOCTL_CREATE_TEXTURE): creates a new column texture on the device. The parameters of this call are the texture size in bytes (maximum 4MiB), the texture height in texels (maximum 1023, or 0 if the texture is not to be repeated vertically), and a pointer to the texture data. The result is a file descriptor referring to the created texture.
- ioctl(DOOMDEV_IOCTL_CREATE_FLAT): creates a new flat texture on the device. The parameter of this call is a pointer to the data (0x1000 bytes). The result is a file descriptor referring to the created texture.
- ioctl(DOOMDEV_IOCTL_CREATE_COLORMAPS): creates a new array of color maps on the device. The parameters of this call are the size of the array (number of color maps) and a pointer to the data (each color map is 0x100 bytes). The result is a file descriptor referring to the created array. The maximum allowable size of the array is 0x100 maps.
Textures and color map arrays do not support any standard operations except close (which, if all other references have already been released, releases their memory) – they can only be used as parameters for drawing calls. It is also impossible to change their content in any way after creation.
All pointers are passed as uint64_t so that the structures have the same layout in 64-bit mode as in 32-bit mode, avoiding the need to define corresponding _compat structures. For the same reason, many structures have unused _pad fields.
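The layout argument can be checked mechanically. The sketch below uses a made-up structure (example_fill_args is hypothetical and not copied from doomdev.h) to show why an explicit _pad field matters: without it, the i386 ABI would make this structure 12 bytes long, because it aligns uint64_t to 4 bytes inside structures, while x86-64 would make it 16 bytes long.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical ioctl argument block, for illustration only; the real
 * definitions are in doomdev.h. */
struct example_fill_args {
	uint64_t rects_ptr;  /* user-space pointer carried as a 64-bit integer */
	uint32_t rects_num;
	uint32_t _pad;       /* explicit padding, keeps sizeof at 16 everywhere */
};

/* These hold for both 32-bit and 64-bit x86 builds. */
_Static_assert(sizeof(struct example_fill_args) == 16, "size differs");
_Static_assert(offsetof(struct example_fill_args, rects_num) == 8,
	       "offset differs");

/* Turn the 64-bit field back into a pointer on the receiving side. */
static inline const void *example_user_ptr(uint64_t raw)
{
	return (const void *)(uintptr_t)raw;
}
```

With identical layouts, a single ioctl handler can serve both 32-bit and 64-bit callers.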
The following operations can be called on the frame buffers:
- ioctl(DOOMDEV_SURF_IOCTL_COPY_RECT): performs a series of COPY_RECT operations to a given buffer. Parameters are:
  - surf_src_fd: file descriptor pointing to the frame buffer from which the copy should be made.
  - rects_ptr: a pointer to an array of doomdev_copy_rect structures.
  - rects_num: number of rectangles to copy.
  - in the doomdev_copy_rect structures:
    - pos_dst_x, pos_dst_y – coordinates of the target rectangle in the given buffer (top left corner).
    - pos_src_x, pos_src_y – coordinates of the source rectangle in the source buffer.
    - width, height – size of the rectangle to be copied.
  The semantics of this call are quite similar to write: the driver tries to perform as many operations as possible from the given list, stopping in case of error or signal arrival. If the first operation failed, the error code is returned. Otherwise, the number of completed operations is returned. The user code is responsible for retrying when incomplete.
  The user is responsible for ensuring that, within one ioctl call, no pixel is both written and read (i.e., the INTERLOCK command between the rectangles will not be required). The driver does not have to detect such situations (but it can if it wants to) – sending commands to the device and obtaining an incorrect drawing result is acceptable in such a situation.
- ioctl(DOOMDEV_SURF_IOCTL_FILL_RECT): performs a series of FILL_RECT operations. Parameters:
  - rects_ptr: a pointer to an array of doomdev_fill_rect structures.
  - rects_num: number of rectangles to fill.
  - in the doomdev_fill_rect structures:
    - pos_dst_x, pos_dst_y – coordinates of the target rectangle in the given buffer.
    - width, height – size of the rectangle to be filled.
    - color – the fill color.
  The returned value is as in DOOMDEV_SURF_IOCTL_COPY_RECT.
- ioctl(DOOMDEV_SURF_IOCTL_DRAW_LINE): performs a series of DRAW_LINE operations. Parameters:
  - lines_ptr: a pointer to an array of doomdev_line structures.
  - lines_num: number of lines to draw.
  - in the doomdev_line structures:
    - pos_a_x, pos_a_y: coordinates of the first endpoint of the line.
    - pos_b_x, pos_b_y: coordinates of the second endpoint.
    - color – the color of the line to be drawn.
  The returned value is as in DOOMDEV_SURF_IOCTL_COPY_RECT.
- ioctl(DOOMDEV_SURF_IOCTL_DRAW_BACKGROUND): performs the DRAW_BACKGROUND operation. Parameters:
  - flat_fd: a file descriptor pointing to a flat texture that will serve as the background.
  In case of a successful call, 0 is returned.
- ioctl(DOOMDEV_SURF_IOCTL_DRAW_COLUMNS): performs a series of DRAW_COLUMN operations. Parameters:
  - draw_flags: a combination of the following flags:
    - DOOMDEV_DRAW_FLAGS_FUZZ – if set, the fuzz effect will be rendered – most parameters are ignored (including other flags).
    - DOOMDEV_DRAW_FLAGS_TRANSLATE – if set, the palette will be remapped according to the translation color map.
    - DOOMDEV_DRAW_FLAGS_COLORMAP – if set, colors will be dimmed according to the color map.
  - texture_fd: a descriptor of the column texture (ignored if the FUZZ flag is set).
  - translation_fd: a descriptor of the color map array used by the TRANSLATE flag (ignored if the flag is not set).
  - colormap_fd: a descriptor of the color map array used by the COLORMAP and FUZZ flags. Ignored if none of these flags is set.
  - translation_idx: index of the color map used by the TRANSLATE option. Used only if the TRANSLATE flag is set.
  - columns_num: number of columns to draw.
  - columns_ptr: a pointer to an array of doomdev_column structures:
    - column_offset: starting offset of this column in the texture.
    - ustart: an unsigned fixed-point 16.16 number, must be in the range supported by the hardware. Ignored if the FUZZ flag is used.
    - ustep: an unsigned fixed-point 16.16 number, must be in the range supported by the hardware. Ignored if the FUZZ flag is used.
    - x: the x coordinate of the column.
    - y1, y2: the y coordinates of the top and bottom pixels of the column.
    - colormap_idx: index of the color map used by FUZZ and COLORMAP flags. Ignored if neither of those is set.
  The returned value is as in DOOMDEV_SURF_IOCTL_COPY_RECT.
- ioctl(DOOMDEV_SURF_IOCTL_DRAW_SPANS): performs a series of DRAW_SPAN operations. Parameters:
  - flat_fd: a flat texture descriptor.
  - translation_fd: like above.
  - colormap_fd: like above.
  - draw_flags: like above, but without the FUZZ flag.
  - translation_idx: like above.
  - spans_num: number of spans to draw.
  - spans_ptr: a pointer to an array of doomdev_span structures:
    - ustart, vstart: like ustart above.
    - ustep, vstep: like ustep above.
    - x1, x2: the x coordinates of the leftmost and rightmost pixel of the span.
    - y: the y coordinate of the span.
    - colormap_idx: like above.
  The returned value is as in DOOMDEV_SURF_IOCTL_COPY_RECT.
- lseek: sets the position in the buffer for subsequent read calls.
- read, pread, readv, etc: waits for completion of all previously submitted drawing operations for the given buffer, and then reads the finished data from the buffer to the user space. In case of an attempt to read outside of buffer bounds, end-of-file should be returned.
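The write-like return convention shared by the batch ioctls can be sketched in isolation. This is a minimal illustration, not the required implementation: submit_fn, submit_batch and fail_at are hypothetical names, and the callback stands in for whatever validates and queues a single operation. Handling of signal arrival is left out.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical callback: submits operation number idx and returns 0 on
 * success or a negative errno value on failure. */
typedef int (*submit_fn)(void *ctx, size_t idx);

/* Submit up to num operations, mimicking write(): report how many were
 * accepted, or the error code if not even the first one succeeded. */
static long submit_batch(submit_fn submit, void *ctx, size_t num)
{
	size_t done;

	for (done = 0; done < num; done++) {
		int err = submit(ctx, done);

		if (err)
			return done ? (long)done : err;
	}
	return (long)done;
}

/* Toy submitter for demonstration: fails with -EINVAL at index *ctx. */
static int fail_at(void *ctx, size_t idx)
{
	return idx == *(size_t *)ctx ? -EINVAL : 0;
}
```

With five operations of which the fourth is invalid, submit_batch reports 3; the caller then retries from the fourth operation and receives the error code itself.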
The driver should detect commands with incorrect parameters (wrong file type passed as *_fd, coordinates extending beyond the frame buffer, etc.) and return the error EINVAL. If the user tries to create textures or frame buffers larger than those supported by the hardware, EOVERFLOW should be returned.
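A possible shape for these checks is sketched below. The function names are hypothetical, and the order of the checks (size limit first, then well-formedness) is an assumption of this sketch rather than something the task text prescribes.

```c
#include <errno.h>
#include <stdint.h>

#define SURF_MAX_DIM     2048u
#define SURF_WIDTH_ALIGN 64u

/* Check the dimensions of a new frame buffer.
 * Returns 0 or a negative errno value. */
static int check_surface_dims(uint32_t width, uint32_t height)
{
	if (width > SURF_MAX_DIM || height > SURF_MAX_DIM)
		return -EOVERFLOW;  /* larger than the hardware supports */
	if (width == 0 || height == 0 || width % SURF_WIDTH_ALIGN)
		return -EINVAL;     /* malformed rather than too large */
	return 0;
}

/* Check that a rectangle lies fully inside a surf_w x surf_h surface.
 * The sums are computed in 64 bits so they cannot wrap around. */
static int check_rect(uint32_t surf_w, uint32_t surf_h,
		      uint32_t x, uint32_t y, uint32_t w, uint32_t h)
{
	if ((uint64_t)x + w > surf_w || (uint64_t)y + h > surf_h)
		return -EINVAL;
	return 0;
}
```

Widening before the addition matters: with 32-bit arithmetic, a huge x plus a small width could wrap around and pass the bounds check.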
The driver should register its devices in sysfs so that udev automatically creates device files with appropriate names in /dev. The major and minor numbers for these devices are arbitrary (majors should be allocated dynamically).
A header file with the appropriate definitions can be found here: https://github.com/koriakin/prboom-plus/blob/doomdev/src/doomdev.h
The driver can assume a limit of 256 devices in the system.
Assumptions for interaction with hardware
It can be assumed that, before the driver is loaded, the device is in the state it has right after a hardware reset. The driver should leave the device in the same state when it is unloaded.
A fully-scored solution should work asynchronously – drawing ioctl operations should send commands to the device and return to user space without waiting for completion (but if the command buffers are already full, it is acceptable to wait for free space to become available). Waiting for commands to complete should only happen in read calls, which actually need the drawing results.
Scoring
You can get up to 10 points for the task. The score is a sum of three parts:
- full use of the device (from 0 to 2 points):
  - fully asynchronous operation (ioctl does not wait for completion of sent commands, starting to send commands by ioctl does not wait for the commands sent earlier to be finished, read does not require stopping the whole device): 1p
  - using the command fetch block: 1p
- test result (from 0 to 8 points)
- evaluation of the solution code (from 0 to -10 points)
Solution format
The driver should be implemented as a Linux kernel module for kernel version 4.9.13. The module containing the driver should be named harddoom.ko.
As the solution, you should deliver an archive containing:
- module sources
- Makefile and Kbuild files that can build the module
- a brief description of the solution
QEMU
A modified version of qemu, available in source form, is required to use the HardDoom™ device.
To compile the modified version of qemu:
- Clone the https://github.com/koriakin/qemu repository
- git checkout harddoom
- Ensure that dependencies are installed: ncurses, libsdl, curl, and in some distributions also ncurses-dev, libsdl-dev, curl-dev (package names may vary slightly depending on the distribution)
- Run ./configure with options as you like (see ./configure --help). The official binary was compiled with: --target-list=i386-softmmu,x86_64-softmmu --python=$(which python2) --audio-drv-list=alsa,pa
- Run make
- Install by executing make install, or run directly (the binary is x86_64-softmmu/qemu-system-x86_64).
To run the modified qemu with the HardDoom™ device, give it the -device harddoom option.
Passing this option several times will cause emulation of several instances of the device.
To add a HardDoom™ device live to a running qemu:
- go to monitor mode in qemu (Ctrl+Alt+2 in the qemu window)
- enter device_add harddoom
- go back to the regular screen with Ctrl+Alt+1
- in the guest, run echo 1 >/sys/bus/pci/rescan so that Linux notices the new device
To simulate removing the device:
echo 1 > /sys/bus/pci/devices/0000:<device id>/remove
Tests
To test the driver we have prepared a modified version of prboom-plus, which is a modernized version of the Doom game engine. To start it:
- install the following packages in the image:
  - libsdl2-dev
  - libsdl2-mixer-dev
  - libsdl2-image-dev
  - libsdl2-net-dev
  - xfce4 [or other graphic environment]
  - xserver-xorg
  - autoconf
- download sources from the https://github.com/koriakin/prboom-plus repository
- choose branch doomdev
- compile and install the sources (without installation, the program will not be able to find its data file, prboom-plus.wad):
  - ./bootstrap
  - ./configure --prefix=$HOME
  - make
  - make install
- download the game data file. You can use any of the following files:
  - freedoom1.wad or freedoom2.wad from the Freedoom project (https://freedoom.github.io/) – a Doom clone available under a free license.
  - doom.wad or doom2.wad from the full version of the original game, if you bought one.
  - doom1.wad from the shareware version of the original game.
- load the driver and make sure you have access to /dev/doom0
- start X11, and in it the game: $HOME/bin/prboom-plus -iwad <game.wad>
- in the Options -> General -> Video mode menu, select the “doomdev” option (the default “8bit” setting selects software rendering in a very similar mode to our device and can be used to compare the results).
In order for the sound to work in the game, you should pass the -soundhw hda option to qemu and turn on the appropriate driver when compiling the kernel (Device Drivers -> Sound card support -> HD-Audio).
Hints
For files for frame buffers, textures, etc., we recommend using the anon_inode_getfile function. Unfortunately, such files do not allow lseek, pread, etc. by default – to fix this, set the flags FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE in the f_mode field.
To get the file structure from a file descriptor, you can use fdget and fdput. To check whether the structure passed to us is of the right type, just compare its file_operations pointer with ours.
We recommend starting the implementation from the FILL_COLOR and DRAW_LINE operations (they only require a frame buffer and allow you to see the map). Then we recommend DRAW_COLUMN (you can omit flags and color maps in the beginning) – it is responsible for drawing most of the graphics in the game and you will not see much without it.
The device requires that the texture size be a multiple of 256 bytes. If you create a texture size that is not supported directly by the hardware, align the size up and pad the data from the user with zeros.
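The rounding and zero-padding can be sketched in user-space C as follows. The helper names are hypothetical, and calloc plus memcpy stand in for what a driver would do with DMA-capable memory and a copy from user space.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TEXTURE_ALIGN 256u

/* Round size up to the next multiple of TEXTURE_ALIGN (a power of two).
 * Sizes are at most 4 MiB here, so the addition cannot overflow. */
static uint32_t texture_padded_size(uint32_t size)
{
	return (size + TEXTURE_ALIGN - 1) & ~(TEXTURE_ALIGN - 1);
}

/* Return a newly allocated, zero-padded copy of the texture data, or NULL
 * on allocation failure. The caller frees the result. */
static uint8_t *texture_pad_copy(const uint8_t *data, uint32_t size,
				 uint32_t *padded_size)
{
	uint32_t padded = texture_padded_size(size);
	uint8_t *buf = calloc(padded ? padded : 1, 1);  /* calloc zero-fills */

	if (!buf)
		return NULL;
	if (size)
		memcpy(buf, data, size);
	*padded_size = padded;
	return buf;
}
```

Allocating zeroed memory first and copying the user data over it avoids a separate pass to clear the tail.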
The size of the texture or frame buffer is rarely exactly a multiple of a page – we can use this by placing the page table in the unused portion of the last page. This will avoid a separate allocation for a (usually very small) page table.
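Whether the page table fits in the tail of the last page is a small arithmetic check. The sketch below rests on three assumptions that should be verified against the device documentation: 4096-byte pages, 4-byte page table entries, and a 64-byte alignment requirement for the page table. The function names are hypothetical.

```c
#include <stdint.h>

#define PAGE_BYTES 4096u  /* assumed device page size */
#define PT_ENTRY   4u     /* assumed bytes per page table entry */
#define PT_ALIGN   64u    /* assumed page table alignment */

/* Number of pages needed for size bytes of data (size is at most a few
 * MiB in this task, so the addition cannot overflow). */
static uint32_t pages_for(uint32_t size)
{
	return (size + PAGE_BYTES - 1) / PAGE_BYTES;
}

/* If the page table fits after the data in the last page, return its byte
 * offset from the start of the buffer; otherwise return -1, meaning that
 * a separate allocation is needed. */
static int64_t tail_page_table_offset(uint32_t size)
{
	uint32_t pages = pages_for(size);
	uint32_t offset = (size + PT_ALIGN - 1) & ~(PT_ALIGN - 1);
	uint32_t pt_bytes = pages * PT_ENTRY;

	if (pages == 0 || offset + pt_bytes > pages * PAGE_BYTES)
		return -1;
	return offset;
}
```

Under these assumptions, a 320x200 frame buffer (64000 bytes, 16 pages) leaves room for its 64-byte page table right after the pixel data, while a 2048x2048 buffer fills its pages exactly and needs a separate allocation.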
It is not necessary to use the FENCE command and related registers in the partially-scored synchronous version of the solution – PING_SYNC is enough.
The full asynchronous version of the solution will need to use FENCE (in conjunction with the FENCE_WAIT register or with the PING_SYNC command to wait in read).
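One way to organize fence bookkeeping is to have each frame buffer remember the fence number sent after its last drawing command; read then waits until the device has reached that number. The sketch below shows only the counter arithmetic. The 26-bit counter width is an assumption to be checked against harddoom.h, and the function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define FENCE_BITS 26u                        /* assumption, see harddoom.h */
#define FENCE_MASK ((1u << FENCE_BITS) - 1u)

/* Next fence number to attach after a batch of commands. */
static uint32_t fence_next(uint32_t last_issued)
{
	return (last_issued + 1u) & FENCE_MASK;
}

/* True if the device counter `reached` is at or past `target`, treating
 * the counter as circular. Valid as long as fewer than half the counter
 * range of fences are outstanding at once. */
static bool fence_passed(uint32_t reached, uint32_t target)
{
	return ((reached - target) & FENCE_MASK) < (FENCE_MASK + 1u) / 2u;
}
```

The circular comparison keeps working when the counter wraps around, which a plain `reached >= target` test would not.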
It may happen that we will not be able to send a command due to the lack of space in the queue of commands (the one built into the device or our own in the memory pointed to by CMD_*_PTR). To implement waiting for free space efficiently, we recommend:
- send the PING_ASYNC command with some minimum frequency (e.g. every 1/8 .. 1/2 of the size of our command buffer or hardware queue)
- keep the PONG_ASYNC interrupt disabled by default
- if there is a lack of space in the queue:
  - reset the PONG_ASYNC interrupt in INTR
  - check whether there is still no space in the queue (protection from a race) – if space has appeared, immediately return to sending
  - enable the PONG_ASYNC interrupt in INTR_ENABLE
  - wait for an interrupt
  - disable the PONG_ASYNC interrupt in INTR_ENABLE
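The order of the steps above can be modelled in plain C. This is a user-space simulation, not driver code: struct fake_dev stands in for the device registers, INTR_PONG_ASYNC is a made-up bit value, and the sleep_fn hook replaces the kernel wait primitive and interrupt handler a driver would use.

```c
#include <stdbool.h>
#include <stdint.h>

#define INTR_PONG_ASYNC 0x2u  /* hypothetical bit, see harddoom.h */

/* Stand-in for the device state a driver would reach through MMIO. */
struct fake_dev {
	uint32_t intr;         /* pending interrupt bits */
	uint32_t intr_enable;  /* enabled interrupt bits */
	unsigned free_slots;   /* free space in the command queue */
};

/* Hypothetical hook: blocks until an enabled interrupt has fired. */
typedef void (*sleep_fn)(struct fake_dev *dev);

/* Wait until the queue has room, following the recommended steps.
 * Returns the number of times the caller actually slept. */
static unsigned wait_for_space(struct fake_dev *dev, sleep_fn sleep_until_irq)
{
	unsigned sleeps = 0;

	while (dev->free_slots == 0) {
		dev->intr &= ~INTR_PONG_ASYNC;         /* reset stale PONG_ASYNC */
		if (dev->free_slots != 0)              /* re-check, closes the race */
			break;
		dev->intr_enable |= INTR_PONG_ASYNC;   /* enable the interrupt */
		sleep_until_irq(dev);                  /* wait for it */
		dev->intr_enable &= ~INTR_PONG_ASYNC;  /* disable it again */
		sleeps++;
	}
	return sleeps;
}

/* Toy "hardware": pretends a PING_ASYNC completed and freed four slots. */
static void fake_irq(struct fake_dev *dev)
{
	dev->free_slots += 4;
	dev->intr |= INTR_PONG_ASYNC;
}
```

In this single-threaded model the re-check after the reset can never succeed; on real hardware the device keeps running between the two reads, which is exactly the race the re-check closes.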
In order for the device to work efficiently, you should avoid unnecessarily sending commands that enforce serialization (mainly between commands from a single ioctl call):
- *_PT, *_ADDR: clear the cache and block the batching of adjacent columns by the microcode (in particular, if two adjacent columns have the same colormap_idx, do not send the COLORMAP_ADDR command between them).
- SURF_DIMS, TEXTURE_DIMS, DRAW_PARAMS, FENCE, PING_SYNC: block the batching of columns (but they are basically free between batches).
- INTERLOCK: blocks parallel processing of COPY_RECT commands (if you want to read from the buffer you last drew before sending the last INTERLOCK command, there is no need to send another – in particular, for a series of COPY_RECT calls between different frame buffers, do not send INTERLOCK between calls).
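Skipping a redundant COLORMAP_ADDR comes down to remembering the last value sent. The sketch below is a simplified model: the command tags and struct cmd are hypothetical stand-ins for the real encodings in harddoom.h, and each column is reduced to its color map index.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical command tags, for illustration only. */
enum { CMD_COLORMAP_ADDR = 1, CMD_DRAW_COLUMN = 2 };

struct cmd {
	int type;
	uint32_t arg;
};

/* Emit commands for n columns into out (capacity at least 2 * n), sending
 * COLORMAP_ADDR only when the color map differs from the previous column.
 * Returns the number of commands written. */
static size_t emit_columns(const uint32_t *colormap_idx, size_t n,
			   struct cmd *out)
{
	size_t used = 0;
	int have_last = 0;
	uint32_t last = 0;

	for (size_t i = 0; i < n; i++) {
		if (!have_last || colormap_idx[i] != last) {
			out[used++] = (struct cmd){ CMD_COLORMAP_ADDR,
						    colormap_idx[i] };
			last = colormap_idx[i];
			have_last = 1;
		}
		out[used++] = (struct cmd){ CMD_DRAW_COLUMN, (uint32_t)i };
	}
	return used;
}
```

For columns using color maps 1, 1, 2, 2, 1 this emits three COLORMAP_ADDR commands instead of five. A driver could keep the remembered value per device rather than per call, so that the saving also applies across consecutive ioctl calls.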
You do not need to worry about redundant FILL_COLOR, XY_*, *START, *STEP, and PING_ASYNC commands – the device needs no more time to process them than the driver would need to check for redundancy.
If you want to temporarily (for testing) change something in the game (e.g. to comment out operations that are not supported yet), the code responsible for driving the device can be found in src/i_doomdev.c.
If you break the configuration of prboom-plus so badly that it no longer runs well enough to change the settings, you can find its configuration file in $HOME/.prboom-plus/prboom-plus.cfg. Deleting it will restore the default settings.
To see the various operations in action:
- DRAW_COLUMN without any flags: used to draw interface graphics (menu, etc.).
- DRAW_COLUMN with COLORMAP: used to draw all walls and most objects.
- DRAW_COLUMN with TRANSLATE: used to draw the new HUD (available under the F5 button – perhaps after pressing several times). If the palette swap works, some of the digits should not be red.
- DRAW_COLUMN with FUZZ: enter the idbeholdi code to make yourself invisible (simply by pressing consecutive letters during the game).
- DRAW_SPAN: used for floor / ceiling drawing.
- DRAW_LINE and FILL_COLOR: used to draw the map (Tab button).
- COPY_RECT: used to draw the transition effect between game states (starting a new game or completing the level).
- DRAW_BACKGROUND: reduce the screen size from full (pressing the - button several times) – the screen border should be filled.