semaphoric775/QoiDecoder

QOI Decoder

AXI-Stream compressed input → AXI-Stream RGBA pixel output, fully pipelined.

Interface

| Signal | Width | Direction | Description |
|---|---|---|---|
| clk | 1 | in | Clock |
| rst | 1 | in | Synchronous reset (active high) |
| s_axis_tdata | 40 | in | Packed input bytes (up to 5 per beat) |
| s_axis_tkeep | 5 | in | Byte-enable, MSB-aligned |
| s_axis_tvalid | 1 | in | |
| s_axis_tready | 1 | out | |
| s_axis_tlast | 1 | in | Asserted on last beat of compressed frame |
| m_axis_tdata | 32 | out | RGBA pixel (R at [31:24]) |
| m_axis_tvalid | 1 | out | |
| m_axis_tready | 1 | in | |
| m_axis_tlast | 1 | out | Asserted on last pixel of frame |
| m_cfg_width | 16 | out | Frame width in pixels (from QOI header) |
| m_cfg_height | 16 | out | Frame height in pixels (from QOI header) |
| m_cfg_channels | 8 | out | Channel count from QOI header |
| m_cfg_colorspace | 8 | out | Colorspace field from QOI header |
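
As an illustration of the input framing, here is a minimal Python sketch of how a testbench might split a compressed byte stream into 40-bit beats. The function name `pack_beats` is hypothetical, and the lane ordering is an assumption: the table only says tkeep is MSB-aligned, so the sketch places the first byte in stream order at tdata[39:32] and zero-fills unused low lanes. Check the RTL before relying on this ordering.

```python
from typing import Iterator, Tuple

BYTES_PER_BEAT = 5  # s_axis_tdata is 40 bits wide


def pack_beats(payload: bytes) -> Iterator[Tuple[int, int, int]]:
    """Yield (tdata, tkeep, tlast) for each beat of a compressed frame.

    ASSUMPTION: the first byte in stream order sits in tdata[39:32] and its
    enable is tkeep[4]; unused low lanes of a partial final beat are zero.
    """
    for off in range(0, len(payload), BYTES_PER_BEAT):
        chunk = payload[off:off + BYTES_PER_BEAT]
        n = len(chunk)
        # Left-justify the valid bytes within the 40-bit word.
        tdata = int.from_bytes(chunk.ljust(BYTES_PER_BEAT, b"\x00"), "big")
        # n ones followed by (5 - n) zeros, e.g. n = 2 gives 0b11000.
        tkeep = ((1 << n) - 1) << (BYTES_PER_BEAT - n)
        # tlast marks the beat that carries the final payload byte.
        tlast = int(off + BYTES_PER_BEAT >= len(payload))
        yield tdata, tkeep, tlast
```

For a 7-byte payload this produces one full beat followed by a 2-byte beat with tkeep = 0b11000 and tlast set.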

Parameters

| Parameter | Default | Description |
|---|---|---|
| CHANNELS | 4 | 3 = RGB (alpha forced to 255), 4 = RGBA |
| FIFO_DEPTH | 64 | Internal byte FIFO depth |

AI Usage Disclaimer

Claude Code and Codex were used in this project. I designed the architecture, test plan, formal assertions, and synthesis strategy; AI was used for scripting tasks, cocotb tests, Makefiles, and documentation.

The QOI repository includes a reference implementation. This decoder is tested against it, which lends some additional confidence that the implementation works.

Additionally, I want to acknowledge Alex Forencich's TAXI project, which provides the underlying AXI infrastructure and excellent examples of modern testbenches.

Implementation

The decoder is a 3-stage pipeline:

  1. Byte unpacker (qoi_unpacker) — strips the 14-byte header, buffers the byte stream, and emits 5-byte aligned beats into an internal FIFO. Parses width, height, channels, and colorspace into sideband outputs.
  2. Chunk decoder (qoi_chunk_dec) — reads from the FIFO and identifies chunk boundaries (RUN, INDEX, DIFF, LUMA, RGB, RGBA). Emits a chunk descriptor (type, data, run length) per chunk.
  3. Pixel reconstructor (qoi_pixel_rec) — maintains prev_pixel and a 64-entry distributed-RAM LUT. Expands RUN repeats, applies DIFF/LUMA arithmetic, looks up INDEX chunks. LUT writes are 2-cycle pipelined: cycle 1 captures the emitted pixel, cycle 2 computes the hash from the registered value.
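
The behaviour the three stages implement together can be sketched as a small software golden model. This is not the repository's RTL or its test code; it is a minimal Python rendering of the public QOI specification (header layout, chunk tags, and index hash), and the names `parse_header`, `decode_chunks`, and `qoi_hash` are hypothetical. It ignores the hardware pipelining (for example the 2-cycle LUT write) and the 8-byte end marker.

```python
import struct
from typing import List, Tuple

Pixel = Tuple[int, int, int, int]  # (r, g, b, a)


def qoi_hash(p: Pixel) -> int:
    """Index position defined by the QOI spec."""
    r, g, b, a = p
    return (r * 3 + g * 5 + b * 7 + a * 11) % 64


def parse_header(data: bytes) -> Tuple[int, int, int, int]:
    """Parse the 14-byte header: magic, width, height, channels, colorspace."""
    magic, width, height, channels, colorspace = struct.unpack(">4sIIBB", data[:14])
    if magic != b"qoif":
        raise ValueError("not a QOI stream")
    return width, height, channels, colorspace


def decode_chunks(data: bytes, n_pixels: int) -> List[Pixel]:
    """Decode the chunk stream that follows the header into n_pixels pixels."""
    lut: List[Pixel] = [(0, 0, 0, 0)] * 64  # index starts zeroed
    prev: Pixel = (0, 0, 0, 255)            # spec-defined initial pixel
    out: List[Pixel] = []
    i = 0
    while len(out) < n_pixels:
        b0 = data[i]; i += 1
        run = 1
        if b0 == 0xFE:                      # RGB: alpha carried over
            prev = (data[i], data[i + 1], data[i + 2], prev[3]); i += 3
        elif b0 == 0xFF:                    # RGBA
            prev = (data[i], data[i + 1], data[i + 2], data[i + 3]); i += 4
        elif b0 >> 6 == 0b00:               # INDEX: LUT lookup
            prev = lut[b0 & 0x3F]
        elif b0 >> 6 == 0b01:               # DIFF: 2-bit deltas, bias 2
            dr, dg, db = ((b0 >> 4) & 3) - 2, ((b0 >> 2) & 3) - 2, (b0 & 3) - 2
            prev = ((prev[0] + dr) & 0xFF, (prev[1] + dg) & 0xFF,
                    (prev[2] + db) & 0xFF, prev[3])
        elif b0 >> 6 == 0b10:               # LUMA: green delta plus offsets
            b1 = data[i]; i += 1
            dg = (b0 & 0x3F) - 32
            dr = dg + (b1 >> 4) - 8
            db = dg + (b1 & 0x0F) - 8
            prev = ((prev[0] + dr) & 0xFF, (prev[1] + dg) & 0xFF,
                    (prev[2] + db) & 0xFF, prev[3])
        else:                               # RUN: repeat prev, bias -1
            run = (b0 & 0x3F) + 1
        lut[qoi_hash(prev)] = prev
        out.extend([prev] * run)
    return out[:n_pixels]
```

A model like this can serve as a scoreboard in a testbench: feed the same bytes to the DUT and compare each output pixel against the list it returns.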

Post-Route Resource Summary

Target: Artix-7 xc7a35t-CPG236-1, out-of-context

| Resource | Used | Available | Util% |
|---|---|---|---|
| Slice LUTs | 1051 | 20,800 | 5.1% |
| - LUT as Logic | 1039 | 20,800 | 5.0% |
| - LUT as Distributed RAM | 12 | 9,600 | 0.1% |
| Slice Registers (FF) | 2450 | 41,600 | 5.9% |
| RAMB18 | 0 | 100 | 0% |
| DSP48 | 1 | 90 | 1.1% |
| F7 Muxes | 257 | 16,300 | 1.6% |
| F8 Muxes | 128 | 8,150 | 1.6% |

Post-Route Timing Summary

| Parameter | Value |
|---|---|
| Target clock | 120 MHz (8.333 ns) |
| WNS (worst negative slack) | +1.153 ns |
| Timing constraints met | Yes |
| Failing setup endpoints | 0 / 7020 |
| Failing hold endpoints | 0 / 7020 |
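
As a rough reading of these numbers, positive slack implies the critical path is shorter than the clock period. The sketch below (the helper name `fmax_estimate_mhz` is hypothetical) turns the period and WNS into an approximate maximum clock. It is only an estimate: it assumes the same placement and routing, and a rerun at a tighter constraint could land higher or lower.

```python
def fmax_estimate_mhz(period_ns: float, wns_ns: float) -> float:
    """Approximate max clock: the critical path takes (period - WNS) ns."""
    return 1e3 / (period_ns - wns_ns)


# 8.333 ns target period with +1.153 ns slack gives a 7.180 ns critical path.
print(f"{fmax_estimate_mhz(8.333, 1.153):.1f} MHz")  # prints "139.3 MHz"
```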

Power (post-route estimate, 120 MHz)

| Component | Power (W) |
|---|---|
| Total on-chip | 0.084 |
| Dynamic | 0.016 |
| - Clocks | 0.007 |
| - Logic + signals | 0.008 |
| Static | 0.068 |

License

MIT License. See LICENSE for the full text.

The QOI format specification and reference implementation are by Dominic Szablewski (@phoboslab), also MIT licensed.
