A hardware-based lossless text compression and decompression system implemented in Verilog, targeting the SkyWater 130nm (sky130) PDK. The design encodes and decodes English text using a predefined Huffman tree optimized for English character frequency distribution.
Huffman encoding assigns shorter binary codewords to more frequent characters and longer codewords to rarer ones, which lowers the average number of bits per character without losing information. This project implements the full encode-decode pipeline as synthesizable RTL, verifies it with a file-driven Verilog testbench, and targets ASIC implementation using an open-source EDA flow.
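
To illustrate how a frequency-based code table can be derived, here is a minimal Python sketch of classic Huffman construction. It is not the project's actual tree: the function name `build_huffman_codes` and the weights in `example_freqs` are made up for the example.

```python
import heapq

def build_huffman_codes(freqs):
    """Return {symbol: bitstring} for a dict of symbol -> weight."""
    # Heap items are (weight, tie_breaker, {symbol: partial_code}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(w, i, {sym: ""}) for i, (sym, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {sym: "0" for sym in heap[0][2]}
    counter = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; prefix their codes with 0 and 1.
        w0, _, c0 = heapq.heappop(heap)
        w1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c0.items()}
        merged.update({s: "1" + c for s, c in c1.items()})
        heapq.heappush(heap, (w0 + w1, counter, merged))
        counter += 1
    return heap[0][2]

# Illustrative weights only, not the project's frequency table.
example_freqs = {" ": 18, "e": 12, "t": 9, "a": 8, "q": 1, "z": 1}
codes = build_huffman_codes(example_freqs)

for sym, code in sorted(codes.items(), key=lambda kv: len(kv[1])):
    print(repr(sym), code)
```

With these weights, the frequent symbols (`' '`, `e`, `t`) receive 2-bit codes while the rare ones (`q`, `z`) receive 4-bit codes, and no code is a prefix of another.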
The system is built around a single dual-mode RTL module controlled by the `SEL` signal:
| SEL | Mode | Input | Output |
|---|---|---|---|
| 0 | Encoder | 8-bit ASCII char | 16-bit Huffman code + length |
| 1 | Decoder | 16-bit encoded bits | 8-bit recovered ASCII char |
Each entry in the encoding file is a 28-bit word:
- `[27:24]`: code length (4 bits)
- `[23:8]`: Huffman code, left-justified (16 bits)
- `[7:0]`: ASCII character value (8 bits)
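
The bit layout above can be modelled in a few lines of Python. This is a sketch for checking table entries by hand, not part of the RTL. The helper names `pack_entry` and `unpack_entry`, the example code `0b101`, and the rule that the length must be at least 1 are assumptions made for illustration.

```python
def pack_entry(length, code_bits, ascii_value):
    """Pack one entry: [27:24] length, [23:8] left-justified code, [7:0] ASCII."""
    if not 1 <= length <= 15:
        raise ValueError("length must be nonzero and fit in 4 bits")
    if code_bits >> length:
        raise ValueError("code does not fit in the stated length")
    # Left-justify: the first code bit lands in bit 15 of the 16-bit field.
    code_field = code_bits << (16 - length)
    return (length << 24) | (code_field << 8) | (ascii_value & 0xFF)

def unpack_entry(entry):
    """Split a 28-bit entry into (length, 16-bit code field, ASCII value)."""
    length = (entry >> 24) & 0xF
    code_field = (entry >> 8) & 0xFFFF
    ascii_value = entry & 0xFF
    return length, code_field, ascii_value

# Hypothetical example: 'e' (0x65) with the 3-bit code 101.
entry = pack_entry(3, 0b101, ord("e"))
print(f"{entry:07X}")          # 3A00065
print(unpack_entry(entry))     # (3, 40960, 101)
```

Seven hex digits cover the 28 bits exactly, so each entry fits on one line if the file is stored as hexadecimal text.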

In encoder mode (`SEL = 0`), the flow is:
- Testbench reads one ASCII character at a time from `textfile`
- Testbench scans the `encodingfile` entry by entry, presenting each 28-bit entry to the DUT
- DUT compares `input_ascii` against `encoding[7:0]`; on match, outputs the Huffman code and length
- Testbench writes the Huffman bits to `outputfile.txt`
- Process repeats until ETX (`0x03`) is encountered
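
The encoder steps can be sketched as a software model in Python. It mirrors the scan-and-compare behaviour described above but makes several assumptions: the four-entry `TABLE` uses made-up codes, the names `make_entry`, `encode_char`, and `encode_text` are hypothetical, ETX is treated as a terminator that is not itself encoded, and the result is returned as a string of `0`/`1` characters rather than in the actual `outputfile.txt` format.

```python
ETX = 0x03

def make_entry(length, code, ch):
    # [27:24] length, [23:8] left-justified code, [7:0] ASCII
    return (length << 24) | ((code << (16 - length)) << 8) | ord(ch)

# Hypothetical prefix-free table, for illustration only.
TABLE = [
    make_entry(1, 0b0, "e"),
    make_entry(2, 0b10, "t"),
    make_entry(3, 0b110, "a"),
    make_entry(3, 0b111, " "),
]

def encode_char(input_ascii, entries):
    """Scan the entries; on an ASCII match return (code_field, length)."""
    for entry in entries:
        if (entry & 0xFF) == input_ascii:
            return (entry >> 8) & 0xFFFF, (entry >> 24) & 0xF
    return None

def encode_text(data, entries):
    """Encode bytes up to (not including) ETX into a '0'/'1' string."""
    bits = []
    for byte in data:
        if byte == ETX:
            break
        match = encode_char(byte, entries)
        if match is None:
            raise ValueError(f"no table entry for byte 0x{byte:02X}")
        code_field, length = match
        # Keep only the top `length` bits of the left-justified field.
        bits.append(format(code_field, "016b")[:length])
    return "".join(bits)

encoded = encode_text(b"tea \x03ignored", TABLE)
print(encoded)  # 100110111
```

Here the four characters of `"tea "` take 9 bits instead of 32, and everything after the ETX byte is ignored.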

In decoder mode (`SEL = 1`), the flow is:
- Testbench reads the compressed bitstream from `outputfile.txt`, assembling 16-bit chunks
- DUT accumulates bits in a 32-bit internal buffer
- Testbench scans the `encodingfile`; DUT compares the top bits of its buffer against each entry
- On match, DUT outputs the recovered ASCII character and removes the consumed bits from the buffer
- Testbench writes the recovered character to `outputfile2.txt`
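
The decoder steps can be modelled the same way. The Python sketch below keeps a 32-bit left-justified buffer, refills it in chunks of up to 16 bits, and shifts out the matched code after each hit. The table contents, the names `make_entry` and `decode_bits`, the refill policy (refill whenever 16 or fewer valid bits remain), and the `0`/`1` string input are all assumptions, and the model does not handle padding bits at the end of the stream.

```python
def make_entry(length, code, ch):
    # [27:24] length, [23:8] left-justified code, [7:0] ASCII
    return (length << 24) | ((code << (16 - length)) << 8) | ord(ch)

# Hypothetical prefix-free table, for illustration only.
TABLE = [
    make_entry(1, 0b0, "e"),
    make_entry(2, 0b10, "t"),
    make_entry(3, 0b110, "a"),
    make_entry(3, 0b111, " "),
]

def decode_bits(bitstring, entries):
    """Decode a '0'/'1' string using a 32-bit shift buffer."""
    buffer = 0   # valid bits are left-justified in 32 bits
    valid = 0    # number of valid bits currently in the buffer
    pos = 0      # read position in the input bitstring
    out = []
    while True:
        # Refill with chunks of up to 16 bits while there is room.
        while valid <= 16 and pos < len(bitstring):
            chunk = bitstring[pos:pos + 16]
            pos += len(chunk)
            buffer |= int(chunk, 2) << (32 - valid - len(chunk))
            valid += len(chunk)
        # Scan the table, comparing the top `length` bits of the buffer.
        for entry in entries:
            length = (entry >> 24) & 0xF
            code = ((entry >> 8) & 0xFFFF) >> (16 - length)
            if 0 < length <= valid and (buffer >> (32 - length)) == code:
                out.append(entry & 0xFF)
                # Remove the consumed bits from the buffer.
                buffer = (buffer << length) & 0xFFFFFFFF
                valid -= length
                break
        else:
            break  # nothing matched: input exhausted or invalid
    return bytes(out)

print(decode_bits("100110111", TABLE))  # b'tea '
```

Taking the first matching entry is only safe because the table is prefix-free: at most one code can match the top bits of the buffer at any time.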
| Stage | Tool |
|---|---|
| Simulation | Icarus Verilog (iverilog) |
| Waveform Viewing | GTKWave |
| Synthesis | Yosys + sky130 PDK |
| Place and Route | OpenROAD |
| Physical Verification | Magic |
| PDK | SkyWater 130nm (sky130) |