Acceldev Device

The Acceldev device is attached to the computer via the PCI bus. You will find the necessary information in acceldev.h.

The device does not have memory of its own and uses the main memory of the computer through Direct Memory Access (DMA). To cope with memory fragmentation, it uses virtual addresses and page tables in a device-specific format.

The device supports ACCELDEV_MAX_CONTEXTS (255) independent contexts, which do not share memory. These contexts should be tied to the driver contexts.

The device is designed to allow user commands to run without additional kernel mode validation, while ensuring that users cannot access system memory or other contexts on the device. If user commands are invalid or trigger an error, the device marks the context as errored and raises an interrupt to notify the driver.

Fair sharing of compute time across contexts is not guaranteed; one context may occupy the device and prevent others from running. Dealing with this issue is outside the scope of this assignment.

The device is controlled using MMIO registers. It has only one BAR (BAR0), used for these registers, and uses a single PCI interrupt line.

The MMIO area is 64 KiB in size, but only part of this range is used for registers. All documented registers are 32-bit and little-endian, and should be accessed only through aligned 32-bit reads and writes.
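The access rule above can be sketched as follows. This is a user-space illustration only: BAR0 is simulated with a plain array on a little-endian host, and the names MMIO_SIZE_BYTES, acceldev_offset_ok, reg_write32 and reg_read32 are hypothetical (they are not taken from acceldev.h). A Linux driver would normally perform the actual accesses with ioread32() and iowrite32() on the mapped BAR.

```c
#include <stdint.h>

/* Hypothetical name: size of the BAR0 window, 64 KiB as stated in the text. */
#define MMIO_SIZE_BYTES 0x10000u

/* Returns 1 if `offset` may be used for a 32-bit register access:
 * it must lie inside the 64 KiB window and be aligned to 4 bytes. */
static int acceldev_offset_ok(uint32_t offset)
{
    return offset <= MMIO_SIZE_BYTES - 4u && (offset & 3u) == 0;
}

/* Simulated 32-bit write; `bar0` stands in for the mapped register area.
 * Returns 0 on success, -1 if the offset is out of range or misaligned. */
static int reg_write32(volatile uint32_t *bar0, uint32_t offset, uint32_t val)
{
    if (!acceldev_offset_ok(offset))
        return -1;
    bar0[offset / 4u] = val;
    return 0;
}

/* Simulated 32-bit read with the same checks as reg_write32(). */
static int reg_read32(const volatile uint32_t *bar0, uint32_t offset,
                      uint32_t *out)
{
    if (!acceldev_offset_ok(offset))
        return -1;
    *out = bar0[offset / 4u];
    return 0;
}
```

Rejecting bad offsets in one place keeps every other register helper free of alignment checks.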

Buffers and Paging

Data and commands used by the device are stored in paged buffers. Each context can have up to ACCELDEV_NUM_BUFFERS (16) buffers bound. Bindings are configured either through the array of acceldev_context_on_device_config structures attached via ACCELDEV_CONTEXTS_CONFIGS or with the ACCELDEV_DEVICE_CMD_TYPE_BIND_SLOT device command.

Except for ACCELDEV_CONTEXTS_CONFIGS, which uses 64-bit contiguous memory, all buffers and page tables use 40-bit physical addresses, 22-bit virtual addresses in the buffer, and pages of size ACCELDEV_PAGE_SIZE (4 KiB). The page tables are single-level.

  • The kernel passes a 64-bit physical address of the buffer's page table to the device.

  • Bits 12–21 of the virtual address select the page table entry, which contains the physical address of the page.

  • Bits 0–11 of the virtual address represent the offset within the page.

Page tables are 4 KiB in size and contain 1024 entries, each being a 32-bit little-endian word. Each page table entry has the following format:

  • Bit 0: PRESENT — if set, the entry is valid. If not set, using the entry raises a MEM_ERROR.

  • Bits 4–31: PA (bits 12–39 of the page's physical address). Bits 0–11 of the physical address are always zero, so pages must be aligned to 4 KiB.
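The entry format and the address split described above can be sketched in C. The helper names (pte_encode, pte_phys, va_translate) and the macros are hypothetical and only mirror the bit layout given in the text; the code assumes a little-endian host, and the rejection of virtual addresses wider than 22 bits is a choice of this sketch, not documented device behavior.

```c
#include <stdint.h>

#define PTE_PRESENT   0x1u    /* bit 0 of a page table entry */
#define PT_INDEX_MASK 0x3ffu  /* 1024 entries, selected by VA bits 12-21 */
#define PAGE_OFF_MASK 0xfffu  /* VA bits 0-11: offset within a 4 KiB page */

/* Build an entry for a 4 KiB-aligned physical address below 2^40:
 * address bits 12-39 are stored in entry bits 4-31. */
static uint32_t pte_encode(uint64_t phys)
{
    return (uint32_t)((phys >> 12) << 4) | PTE_PRESENT;
}

/* Recover the page's physical address from an entry. */
static uint64_t pte_phys(uint32_t pte)
{
    return (uint64_t)(pte >> 4) << 12;
}

/* Software model of the lookup. Returns 0 and stores the physical
 * address in *pa, or -1 where the device would report MEM_ERROR
 * (entry not present). */
static int va_translate(const uint32_t *page_table, uint32_t va, uint64_t *pa)
{
    uint32_t pte;

    if (va >> 22)               /* assumption: reject addresses above 22 bits */
        return -1;
    pte = page_table[(va >> 12) & PT_INDEX_MASK];
    if (!(pte & PTE_PRESENT))
        return -1;
    *pa = pte_phys(pte) | (va & PAGE_OFF_MASK);
    return 0;
}
```

For example, a page at physical address 0x12345000 is stored as the entry 0x00123451.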

Sending Commands

The device supports two types of commands:

  • Device commands, sent and validated by the driver.

  • User commands (also called context commands), sent by running a code buffer via the ACCELDEV_DEVICE_CMD_TYPE_RUN command.

Device Commands

Device commands consist of ACCELDEV_DEVICE_CMD_WORDS (5) 32-bit little-endian words. They are sent via the CMD_MANUAL_FEED registers:

  • BAR0 + 0x008c: CMD_MANUAL_FREE Read-only register. Shows how many full commands may be queued before a FEED_ERROR occurs. The queue holds CMDS_BUFFER_SIZE (255) commands. Assume the queue is empty after a device reset.

  • BAR0 + 0x008c – BAR0 + 0x009c: CMD_MANUAL_FEED Five write-only registers for writing command words. Writing the last (4th counting from 0) word at BAR0 + 0x009c submits the command. Submitting when the queue is full (CMD_MANUAL_FREE == 0) raises a FEED_ERROR interrupt.
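A minimal sketch of the submission sequence described above, using the offsets as listed in the text. BAR0 is simulated with a plain array, so the read of CMD_MANUAL_FREE and the write of word 0 touch the same array slot here; the function name submit_device_cmd and the macro names are hypothetical.

```c
#include <stdint.h>

#define REG_CMD_MANUAL_FREE 0x008cu /* read: commands that still fit */
#define REG_CMD_MANUAL_FEED 0x008cu /* write: first of five word registers */
#define DEVICE_CMD_WORDS    5u      /* ACCELDEV_DEVICE_CMD_WORDS */

/* Writes one device command, word 0 first. Word 4 lands at BAR0 + 0x009c,
 * which is the write that submits the command.
 * Returns 0 on success, -1 if the queue reports no free space
 * (submitting anyway would raise FEED_ERROR). */
static int submit_device_cmd(volatile uint32_t *bar0,
                             const uint32_t cmd[DEVICE_CMD_WORDS])
{
    uint32_t i;

    if (bar0[REG_CMD_MANUAL_FREE / 4u] == 0)
        return -1;
    for (i = 0; i < DEVICE_CMD_WORDS; i++)
        bar0[REG_CMD_MANUAL_FEED / 4u + i] = cmd[i];
    return 0;
}
```

A real driver would also serialize callers (for instance with a lock), since the five writes must not interleave with another command's words.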

NOP Command

Does nothing. Can be used to fill the queue if you feel like it.

  • 0th word: header - Command type: 0x0

The other words are unused. To submit the command you only need to write the 0th and 4th words.

FENCE Command

Signals that all commands submitted before it have been processed.

  • 0th word: header - Command type: 0x3

  • 1st word: 32-bit value VAL

Behavior:

  1. Waits for completion of all previous commands.

  2. Sets CMD_FENCE_LAST to VAL.

  3. If VAL == CMD_FENCE_WAIT, triggers FENCE_WAIT interrupt.

Registers:

  • BAR0 + 0x00a0: CMD_FENCE_LAST 32-bit read/write register. Set to VAL while processing FENCE.

  • BAR0 + 0x00a4: CMD_FENCE_WAIT 32-bit read/write register. Used to schedule FENCE_WAIT interrupt.
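The FENCE behavior above can be modeled in a few lines of C. This is a software model for reasoning about the registers, not driver code; struct fence_regs, encode_fence and model_fence are hypothetical names.

```c
#include <stdint.h>

#define DEVICE_CMD_FENCE 0x3u

/* Software stand-ins for CMD_FENCE_LAST and CMD_FENCE_WAIT. */
struct fence_regs {
    uint32_t last;
    uint32_t wait;
};

/* Fill the five command words for FENCE with value `val`.
 * Words 2-4 are unused; they are zeroed here by choice. */
static void encode_fence(uint32_t cmd[5], uint32_t val)
{
    cmd[0] = DEVICE_CMD_FENCE;
    cmd[1] = val;
    cmd[2] = 0;
    cmd[3] = 0;
    cmd[4] = 0;
}

/* Model of steps 2 and 3: store VAL in CMD_FENCE_LAST and report
 * whether the FENCE_WAIT interrupt would become active. */
static int model_fence(struct fence_regs *r, uint32_t val)
{
    r->last = val;
    return val == r->wait;
}
```

One possible use is to number fences with an increasing counter and write the number of the fence of interest to CMD_FENCE_WAIT, so that only that fence raises the interrupt.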

RUN Command

Schedules user commands on a context.

  • 0th word: header - Bits 0–3: command type 0x1 - Bits 4–31: context ID

  • 1st–2nd words: lower/upper 32 bits of code buffer page table address

  • 3rd word: offset (in bytes) of the first command

  • 4th word: size (in bytes) of commands to process (n_commands * ACCELDEV_USER_CMD_WORDS * sizeof(uint32_t))
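The RUN layout above can be sketched as an encoder. The function name encode_run is hypothetical, and the value of ACCELDEV_USER_CMD_WORDS is not given in this text, so it is passed in as the parameter user_cmd_words rather than assumed.

```c
#include <stdint.h>

#define DEVICE_CMD_RUN 0x1u

/* Fill the five command words for RUN.
 *   ctx_id         - context ID, placed in header bits 4-31
 *   code_pt_addr   - address of the code buffer's page table
 *   offset         - byte offset of the first user command
 *   n_commands     - number of user commands to process
 *   user_cmd_words - value of ACCELDEV_USER_CMD_WORDS from acceldev.h */
static void encode_run(uint32_t cmd[5], uint32_t ctx_id, uint64_t code_pt_addr,
                       uint32_t offset, uint32_t n_commands,
                       uint32_t user_cmd_words)
{
    cmd[0] = DEVICE_CMD_RUN | (ctx_id << 4);
    cmd[1] = (uint32_t)code_pt_addr;          /* lower 32 bits */
    cmd[2] = (uint32_t)(code_pt_addr >> 32);  /* upper 32 bits */
    cmd[3] = offset;
    cmd[4] = n_commands * user_cmd_words * (uint32_t)sizeof(uint32_t);
}
```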

BIND_SLOT Command

Binds or unbinds a data buffer to a slot for a given context. Buffers can also be bound and unbound through ACCELDEV_CONTEXTS_CONFIGS, but that method is unsafe while the device may be using the bound buffers. BIND_SLOT is therefore preferred when the context is already running.

  • 0th word: header - Bits 0–3: command type 0x2 - Bits 4–31: context ID

  • 1st word: slot number

  • 2nd–3rd words: lower/upper 32 bits of data buffer page table address

Unbinding a buffer is done by replacing it with a new buffer or by submitting 0 as the buffer page table address.
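A sketch of the BIND_SLOT layout, with unbinding expressed as a page table address of 0. The names encode_bind_slot and encode_unbind_slot are hypothetical; the unused 4th word is set to 0 here, and it still has to be written because that write submits the command.

```c
#include <stdint.h>

#define DEVICE_CMD_BIND_SLOT 0x2u

/* Fill the five command words for BIND_SLOT.
 * `data_pt_addr` is the address of the data buffer's page table. */
static void encode_bind_slot(uint32_t cmd[5], uint32_t ctx_id, uint32_t slot,
                             uint64_t data_pt_addr)
{
    cmd[0] = DEVICE_CMD_BIND_SLOT | (ctx_id << 4);
    cmd[1] = slot;
    cmd[2] = (uint32_t)data_pt_addr;          /* lower 32 bits */
    cmd[3] = (uint32_t)(data_pt_addr >> 32);  /* upper 32 bits */
    cmd[4] = 0;                               /* unused by this command */
}

/* Unbinding is a bind with a zero page table address. */
static void encode_unbind_slot(uint32_t cmd[5], uint32_t ctx_id, uint32_t slot)
{
    encode_bind_slot(cmd, ctx_id, slot, 0);
}
```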

User Commands

Each user command consists of ACCELDEV_USER_CMD_WORDS 32-bit little-endian words stored in a DMA buffer and aligned to 32 bits. Always write the full number of words, even if the command uses fewer.

Supported commands:

  • NOP (0x0)

  • FENCE (0x1)

  • FILL (0x2)

FENCE Command

  1. Waits for previous user commands to finish.

  2. Increments fence_counter in the context config.

  3. Triggers USER_FENCE_WAIT interrupt.

  • 0th word: header - Command type: 0x1

FILL Command

Fills part of a buffer with a value.

  • 0th word: header - Command type: 0x2

  • 1st word: 32-bit value

  • 2nd word: buffer slot

  • 3rd word: start offset (bytes)

  • 4th word: length (bytes)
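A sketch of writing a FILL command into a code buffer. The name write_fill_cmd is hypothetical. The value of ACCELDEV_USER_CMD_WORDS is not stated in this text, so it is passed in as user_cmd_words; the sketch only requires it to be at least 5 (the words FILL uses) and zero-fills any remaining words, which is an assumption about acceptable padding.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define USER_CMD_FILL 0x2u

/* Writes the `index`-th user command in `code` as a FILL.
 *   capacity_words - size of `code` in 32-bit words
 *   user_cmd_words - value of ACCELDEV_USER_CMD_WORDS from acceldev.h
 * Returns 0 on success, -1 if the command would not fit. */
static int write_fill_cmd(uint32_t *code, size_t capacity_words, size_t index,
                          size_t user_cmd_words, uint32_t value, uint32_t slot,
                          uint32_t start, uint32_t length)
{
    uint32_t *cmd;

    if (user_cmd_words < 5 || (index + 1) * user_cmd_words > capacity_words)
        return -1;
    cmd = code + index * user_cmd_words;
    /* Always write the full command, including words FILL does not use. */
    memset(cmd, 0, user_cmd_words * sizeof(uint32_t));
    cmd[0] = USER_CMD_FILL;
    cmd[1] = value;
    cmd[2] = slot;
    cmd[3] = start;   /* start offset in bytes */
    cmd[4] = length;  /* length in bytes */
    return 0;
}
```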

A real accelerator would support more interesting commands, but since the goal here is system programming, these will suffice. If you are interested in this topic, refer to Extras – ONNX Runtime.

Control Registers

BAR0 + 0x0008: ENABLE Controls whether the device processes commands. If 0, then the commands are not processed.

BAR0 + 0x000c – 0x0010: CONTEXTS_CONFIGS Attaches the contexts' configuration memory. The configuration is stored in contiguous DMA memory as an array of acceldev_context_on_device_config structures, one for each of the ACCELDEV_MAX_CONTEXTS contexts.

  • BAR0 + 0x000c: lower 32 bits of address

  • BAR0 + 0x0010: upper 32 bits of address

Interrupts

The device uses six internal interrupts, all multiplexed into one PCI interrupt.

  • FENCE_WAIT — completion of a FENCE command

  • FEED_ERROR — command queue full

  • CMD_ERROR — invalid device command

  • MEM_ERROR — invalid memory access

  • SLOT_ERROR — request for inactive slot

  • USER_FENCE_WAIT — user FENCE command triggered

An interrupt becomes active on event occurrence and inactive when cleared by writing 1 to its bit in the INTR register.

Independently, each of the above interrupts can also be either enabled or disabled at any given time. The driver can set an enabled subset of interrupts by writing an appropriate mask to the INTR_ENABLE register. The device will signal an interrupt on its PCI interrupt line if and only if there is an interrupt that is both enabled and active.

BAR0 + 0x0000: INTR Interrupt status register. Each bit corresponds to an interrupt. Reading returns 1 for active, 0 for inactive. Writing resets (sets to inactive) all interrupts for which 1s were written.

BAR0 + 0x0004: INTR_ENABLE Interrupt enable register. Same bit layout as INTR. A bit value of 1 enables the interrupt. Upon machine reset, the register is set to 0, which blocks the device from signaling a PCI interrupt until the driver is loaded.
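The interplay of active and enabled interrupts can be modeled as follows. This is a software model of the two registers, not driver code; struct intr_model and the function names are hypothetical, and the bit position of each interrupt is not given in this text, so the model works on arbitrary bit masks.

```c
#include <stdint.h>

/* Software stand-ins for INTR (active bits) and INTR_ENABLE. */
struct intr_model {
    uint32_t active;
    uint32_t enabled;
};

/* An event occurred: the corresponding bits become active. */
static void intr_raise(struct intr_model *m, uint32_t bits)
{
    m->active |= bits;
}

/* Write to INTR: every bit written as 1 becomes inactive. */
static void intr_write_status(struct intr_model *m, uint32_t val)
{
    m->active &= ~val;
}

/* The PCI interrupt line is asserted if and only if some interrupt
 * is both active and enabled. */
static int intr_line(const struct intr_model *m)
{
    return (m->active & m->enabled) != 0;
}
```

An interrupt handler built on this model would read INTR, write the same value back to acknowledge exactly the bits it saw, and then handle each of them.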

Starting the Device

To start the device, follow these steps:

  1. Clear INTR by writing all 1s.

  2. Enable required interrupts via INTR_ENABLE.

  3. Attach the context configs by writing their address to ACCELDEV_CONTEXTS_CONFIGS.

  4. Enable all device blocks via ENABLE.

  5. Optionally set CMD_FENCE_LAST and CMD_FENCE_WAIT.

To shut down, write 0 to both ENABLE and INTR_ENABLE.

If the device reports an error, reset it by repeating the start-up procedure.
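The start-up and shutdown steps above can be sketched against a simulated BAR0 (a plain array). The function names are hypothetical, the register offsets are the ones listed in this document, and the values to write to INTR_ENABLE and ENABLE are passed in as parameters because their bit layouts are not given here. The optional step 5 is left out.

```c
#include <stdint.h>

#define REG_INTR             0x0000u
#define REG_INTR_ENABLE      0x0004u
#define REG_ENABLE           0x0008u
#define REG_CONTEXTS_CONFIGS 0x000cu /* low word; high word at 0x0010 */

/* Steps 1-4 of the start-up procedure.
 *   intr_mask   - interrupts to enable (bit layout from acceldev.h)
 *   configs_dma - DMA address of the context config array
 *   enable_val  - value that enables all device blocks (from acceldev.h) */
static void acceldev_start(volatile uint32_t *bar0, uint32_t intr_mask,
                           uint64_t configs_dma, uint32_t enable_val)
{
    bar0[REG_INTR / 4u] = 0xffffffffu;             /* 1. clear all interrupts */
    bar0[REG_INTR_ENABLE / 4u] = intr_mask;        /* 2. enable interrupts */
    bar0[REG_CONTEXTS_CONFIGS / 4u] = (uint32_t)configs_dma;             /* 3. */
    bar0[REG_CONTEXTS_CONFIGS / 4u + 1u] = (uint32_t)(configs_dma >> 32);
    bar0[REG_ENABLE / 4u] = enable_val;            /* 4. start processing */
}

/* Shutdown: stop command processing, then mask all interrupts. */
static void acceldev_stop(volatile uint32_t *bar0)
{
    bar0[REG_ENABLE / 4u] = 0;
    bar0[REG_INTR_ENABLE / 4u] = 0;
}
```

Because error recovery repeats the start-up procedure, keeping it in one function lets the same code serve both probe time and reset.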