This project implements a bit-level file compression and decompression system using Huffman coding in Java. Huffman coding is a lossless compression algorithm that assigns shorter binary codes to more frequent bytes and longer codes to less frequent ones. This implementation has achieved up to 60% size reduction on test files.
- Huffman tree construction based on input byte frequency
- Compression using variable-length binary codes, one per byte value
- Lossless decompression by reconstructing the Huffman tree from the file header
- Custom bit-level input/output classes for compact storage
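As an illustration of what bit-level I/O involves, here is a minimal sketch of a bit writer and reader. The class names `BitWriterSketch` and `BitReaderSketch` are hypothetical, and the choices made here (most significant bit first, zero-padding the last byte) are assumptions; the project's `BitOutputStream` and `BitInputStream` may work differently.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical sketch: packs single bits into bytes, most significant bit first. */
class BitWriterSketch implements AutoCloseable {
    private final OutputStream out;
    private int buffer = 0; // bits collected so far
    private int count = 0;  // number of bits currently in the buffer (0..7)

    BitWriterSketch(OutputStream out) {
        this.out = out;
    }

    void writeBit(int bit) throws IOException {
        buffer = (buffer << 1) | (bit & 1);
        if (++count == 8) {
            out.write(buffer); // a full byte is ready
            buffer = 0;
            count = 0;
        }
    }

    /** Pads the final partial byte with zero bits (an assumption), then closes. */
    @Override
    public void close() throws IOException {
        if (count > 0) {
            out.write(buffer << (8 - count));
        }
        out.close();
    }
}

/** Hypothetical sketch: reads bits back in the same order they were written. */
class BitReaderSketch {
    private final InputStream in;
    private int buffer = 0;
    private int remaining = 0; // unread bits left in the buffer

    BitReaderSketch(InputStream in) {
        this.in = in;
    }

    /** Returns the next bit (0 or 1), or -1 at end of stream. */
    int readBit() throws IOException {
        if (remaining == 0) {
            buffer = in.read();
            if (buffer == -1) {
                return -1;
            }
            remaining = 8;
        }
        remaining--;
        return (buffer >> remaining) & 1;
    }
}
```

Because the last byte is padded, a reader built this way cannot tell padding from data by itself; a real format needs some way to mark where the payload ends, such as a stored symbol count.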
Compression:

- Read the input file and calculate the frequency of each byte.
- Construct a Huffman tree using a priority queue.
- Generate a map of byte values to their corresponding binary codes.
- Write a header that stores the Huffman tree structure.
- Encode the input data using the generated codes and write it as a stream of bits.
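The first three compression steps can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `HuffmanSketch` class, its nested `Node` type, and all method names are hypothetical, and codes are kept as strings of `'0'` and `'1'` only for readability.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

class HuffmanSketch {
    /** Hypothetical node type; the project's HuffmanNode may be shaped differently. */
    static final class Node implements Comparable<Node> {
        final int symbol;   // byte value 0..255, or -1 for internal nodes
        final long weight;  // total frequency of the subtree
        final Node left;
        final Node right;

        Node(int symbol, long weight, Node left, Node right) {
            this.symbol = symbol;
            this.weight = weight;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null && right == null;
        }

        @Override
        public int compareTo(Node other) {
            return Long.compare(weight, other.weight);
        }
    }

    /** Step 1: count how often each byte value occurs. */
    static long[] countFrequencies(byte[] data) {
        long[] freq = new long[256];
        for (byte b : data) {
            freq[b & 0xFF]++;
        }
        return freq;
    }

    /** Step 2: repeatedly merge the two lightest nodes until one root remains. */
    static Node buildTree(long[] freq) {
        PriorityQueue<Node> queue = new PriorityQueue<>();
        for (int s = 0; s < freq.length; s++) {
            if (freq[s] > 0) {
                queue.add(new Node(s, freq[s], null, null));
            }
        }
        if (queue.isEmpty()) {
            return null; // empty input: no tree
        }
        while (queue.size() > 1) {
            Node a = queue.poll();
            Node b = queue.poll();
            queue.add(new Node(-1, a.weight + b.weight, a, b));
        }
        return queue.poll();
    }

    /** Step 3: walk the tree, appending 0 for left and 1 for right. */
    static Map<Integer, String> buildCodes(Node root) {
        Map<Integer, String> codes = new HashMap<>();
        if (root != null) {
            assign(root, "", codes);
        }
        return codes;
    }

    private static void assign(Node node, String prefix, Map<Integer, String> codes) {
        if (node.isLeaf()) {
            // A file with a single distinct byte still needs a one-bit code.
            codes.put(node.symbol, prefix.isEmpty() ? "0" : prefix);
            return;
        }
        assign(node.left, prefix + "0", codes);
        assign(node.right, prefix + "1", codes);
    }
}
```

For the input `aaaabbc`, the most frequent byte `a` ends up with a 1-bit code and `b` and `c` with 2-bit codes, so the 7 input bytes encode to 10 bits before the header is added.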
Decompression:

- Read the file header and reconstruct the Huffman tree.
- Use the tree to decode the compressed bit stream.
- Write the decoded bytes to an output file.
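One possible way to store the tree in a header and decode with it is sketched below. The header layout is an assumption, since the actual format is not described here: a pre-order walk that writes `1` plus 8 symbol bits for a leaf and `0` for an internal node. The `HeaderSketch` class and its methods are hypothetical, the bit stream is modelled as a string of `'0'` and `'1'` to keep the example short, and the decoder is assumed to know how many symbols to read.

```java
import java.io.ByteArrayOutputStream;

class HeaderSketch {
    /** Hypothetical node type; weights are not needed once the tree is built. */
    static final class Node {
        final int symbol; // byte value 0..255, or -1 for internal nodes
        final Node left;
        final Node right;

        Node(int symbol, Node left, Node right) {
            this.symbol = symbol;
            this.left = left;
            this.right = right;
        }

        boolean isLeaf() {
            return left == null && right == null;
        }
    }

    /** Pre-order: '1' + 8 symbol bits for a leaf, '0' then both subtrees otherwise. */
    static void writeTree(Node node, StringBuilder bits) {
        if (node.isLeaf()) {
            bits.append('1');
            for (int i = 7; i >= 0; i--) {
                bits.append((node.symbol >> i) & 1);
            }
        } else {
            bits.append('0');
            writeTree(node.left, bits);
            writeTree(node.right, bits);
        }
    }

    /** Rebuilds the tree; pos[0] is a cursor into the bit string and is advanced. */
    static Node readTree(String bits, int[] pos) {
        if (bits.charAt(pos[0]++) == '1') {
            int symbol = 0;
            for (int i = 0; i < 8; i++) {
                symbol = (symbol << 1) | (bits.charAt(pos[0]++) - '0');
            }
            return new Node(symbol, null, null);
        }
        Node left = readTree(bits, pos);
        Node right = readTree(bits, pos);
        return new Node(-1, left, right);
    }

    /** Walks from the root for each symbol: 0 goes left, 1 goes right. */
    static byte[] decode(Node root, String bits, int[] pos, int symbolCount) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < symbolCount; i++) {
            Node node = root;
            while (!node.isLeaf()) {
                node = bits.charAt(pos[0]++) == '0' ? node.left : node.right;
            }
            out.write(node.symbol);
        }
        return out.toByteArray();
    }
}
```

With this layout a tree with three leaves takes 29 header bits (2 internal markers plus 3 leaves at 9 bits each), and the decoder can resume reading the payload from the position where `readTree` stopped.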
HuffmanCompressor/
│
├── HuffmanCompressor.java // Compression logic
├── HuffmanDecompressor.java // Decompression logic
├── HuffmanNode.java // Tree node structure
├── BitInputStream.java // Bit-level file input
├── BitOutputStream.java // Bit-level file output
└── README.md
javac *.java
java HuffmanCompressor input.txt compressed.huff
java HuffmanDecompressor compressed.huff output.txt

- Achieved up to 60% reduction in file size depending on content redundancy
- Effective on plain text and other compressible file types
- Java SE 8 or higher
- No external libraries required
This project is licensed under the MIT License.