A lightweight, education-oriented compiler designed to transform infix arithmetic expressions into stack-based assembly instructions. This project demonstrates the core phases of compilation, from lexical analysis to code generation, with built-in AST optimizations.
- Lexical Analysis (Tokenizer): Converts raw source code into a stream of typed tokens (Identifiers, Numbers, Operators).
- Abstract Syntax Tree (AST): Builds a hierarchical representation of expressions to preserve operator precedence.
- Constant Folding Optimization: Automatically evaluates constant expressions at compile time (e.g., `5 * 2` becomes `10`) to improve runtime efficiency.
- Stack-based Code Generation: Produces VM-style instructions (`PUSH`, `LOAD`, `ADD`, `STORE`) suitable for stack machines.
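As a rough illustration of the tokenizer stage, the sketch below splits an expression into identifier, number, and operator tokens. The names `TokenType`, `Token`, and `tokenize` are hypothetical and are not taken from the project's `lexer.cpp`. Treating parentheses as operators and silently skipping unknown characters are simplifying assumptions.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical token categories; the project's real names may differ.
enum class TokenType { Identifier, Number, Operator };

struct Token {
    TokenType type;
    std::string text;
};

// Splits an expression such as "a = b + 5 * 2" into typed tokens.
// Whitespace is skipped. Unknown characters are dropped here for brevity;
// a real lexer would more likely report an error.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (std::isspace(c)) {
            ++i;
        } else if (std::isalpha(c) || c == '_') {
            // Identifier: a letter or underscore, then letters, digits, underscores.
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) {
                ++i;
            }
            tokens.push_back({TokenType::Identifier, src.substr(start, i - start)});
        } else if (std::isdigit(c)) {
            // Number: a run of decimal digits (integers only in this sketch).
            std::size_t start = i;
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) {
                ++i;
            }
            tokens.push_back({TokenType::Number, src.substr(start, i - start)});
        } else if (std::string("+-*/=()").find(src[i]) != std::string::npos) {
            tokens.push_back({TokenType::Operator, std::string(1, src[i])});
            ++i;
        } else {
            ++i;  // unknown character: skipped in this sketch
        }
    }
    assert(i == src.size());  // the loop consumes the whole input
    return tokens;
}
```

Calling `tokenize("a = b + 5 * 2")` yields seven tokens: the identifiers `a` and `b`, the numbers `5` and `2`, and the operators `=`, `+`, and `*`.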
- Prerequisite: GCC with `g++` (the GNU C++ compiler)
Build the compiler using the following command:

    g++ main.cpp lexer.cpp parser.cpp codegen.cpp -o compiler

- Define your expression in `input.txt` (Example: `a = b + 5 * 2`)
- Run the compiler:

    ./compiler

Input (`input.txt`):

    a = b + 5 * 2

Output (Instructions):

    LOAD b
    PUSH 10
    ADD
    STORE a
Note how `5 * 2` was optimized to `10` via constant folding.
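The folding step can be sketched as a bottom-up rewrite of the tree: fold both children first, then replace a binary node with a number when both operands are constants. This is only an illustration under assumed names. The `Node` layout and the `number`, `variable`, and `binary` helpers are hypothetical, and the project's own `fold()` may differ in signature and behavior.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical AST node: a number, a variable, or a binary operation.
struct Node {
    enum class Kind { Number, Variable, Binary };
    Kind kind = Kind::Number;
    int value = 0;                   // used when kind == Number
    std::string name;                // used when kind == Variable
    char op = 0;                     // used when kind == Binary: + - * /
    std::unique_ptr<Node> lhs, rhs;  // used when kind == Binary
};

std::unique_ptr<Node> number(int v) {
    auto n = std::make_unique<Node>();
    n->kind = Node::Kind::Number;
    n->value = v;
    return n;
}

std::unique_ptr<Node> variable(const std::string& name) {
    auto n = std::make_unique<Node>();
    n->kind = Node::Kind::Variable;
    n->name = name;
    return n;
}

std::unique_ptr<Node> binary(char op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    auto n = std::make_unique<Node>();
    n->kind = Node::Kind::Binary;
    n->op = op;
    n->lhs = std::move(l);
    n->rhs = std::move(r);
    return n;
}

// Folds constants bottom-up. Integer overflow is not checked in this sketch.
std::unique_ptr<Node> fold(std::unique_ptr<Node> node) {
    assert(node != nullptr);
    if (node->kind != Node::Kind::Binary) return node;

    // Fold the children first so nested constants collapse too.
    node->lhs = fold(std::move(node->lhs));
    node->rhs = fold(std::move(node->rhs));
    if (node->lhs->kind != Node::Kind::Number ||
        node->rhs->kind != Node::Kind::Number) {
        return node;  // at least one operand is only known at run time
    }

    int a = node->lhs->value;
    int b = node->rhs->value;
    switch (node->op) {
        case '+': return number(a + b);
        case '-': return number(a - b);
        case '*': return number(a * b);
        case '/':
            if (b != 0) return number(a / b);
            break;  // leave division by zero unfolded
    }
    return node;
}
```

For the tree of `b + 5 * 2`, folding leaves the `+` node in place (its left operand is a variable) but replaces the `5 * 2` subtree with the number `10`, which matches the `PUSH 10` in the output above.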
- Lexer: Scans the input text and emits tokens for identifiers, numbers, and operators.
- Parser: Recursively constructs an AST and applies `fold()` to simplify nodes.
- Generator: Recursively traverses the optimized AST to emit postfix-style instructions.
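The generator stage amounts to a post-order walk: emit code for the operands first, then the operator, so the instruction order is postfix. The sketch below shows that idea on a hand-built tree. The `Node` layout, `opcode`, `emit`, and `exampleTree` are hypothetical names, and the mnemonics `SUB`, `MUL`, and `DIV` are assumptions, since only `PUSH`, `LOAD`, `ADD`, and `STORE` are listed above.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical AST node, extended with an assignment kind.
struct Node {
    enum class Kind { Number, Variable, Binary, Assign };
    Kind kind = Kind::Number;
    int value = 0;                   // Number
    std::string name;                // Variable, or the target of Assign
    char op = 0;                     // Binary: + - * /
    std::unique_ptr<Node> lhs, rhs;  // Binary operands; Assign uses rhs only
};

// Maps an operator character to a mnemonic. SUB, MUL, and DIV are assumed names.
std::string opcode(char op) {
    switch (op) {
        case '+': return "ADD";
        case '-': return "SUB";
        case '*': return "MUL";
        case '/': return "DIV";
    }
    return "???";  // unknown operator in this sketch
}

// Post-order traversal: operands first, then the operation.
void emit(const Node& n, std::vector<std::string>& out) {
    switch (n.kind) {
        case Node::Kind::Number:
            out.push_back("PUSH " + std::to_string(n.value));
            break;
        case Node::Kind::Variable:
            out.push_back("LOAD " + n.name);
            break;
        case Node::Kind::Binary:
            assert(n.lhs && n.rhs);
            emit(*n.lhs, out);
            emit(*n.rhs, out);
            out.push_back(opcode(n.op));
            break;
        case Node::Kind::Assign:
            assert(n.rhs);
            emit(*n.rhs, out);  // compute the value, then store it
            out.push_back("STORE " + n.name);
            break;
    }
}

// Builds the already-folded tree for "a = b + 10".
std::unique_ptr<Node> exampleTree() {
    auto b = std::make_unique<Node>();
    b->kind = Node::Kind::Variable;
    b->name = "b";

    auto ten = std::make_unique<Node>();
    ten->kind = Node::Kind::Number;
    ten->value = 10;

    auto sum = std::make_unique<Node>();
    sum->kind = Node::Kind::Binary;
    sum->op = '+';
    sum->lhs = std::move(b);
    sum->rhs = std::move(ten);

    auto assign = std::make_unique<Node>();
    assign->kind = Node::Kind::Assign;
    assign->name = "a";
    assign->rhs = std::move(sum);
    return assign;
}
```

Passing `*exampleTree()` and an empty vector to `emit` fills the vector with `LOAD b`, `PUSH 10`, `ADD`, `STORE a`, the same sequence as the sample output.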
Developed as a pedagogical tool for understanding compiler construction and optimization.