An educational compiler implementation featuring a festive, Christmas-themed programming language with complete front-end analysis and partial MIPS assembly code generation.
- Overview
- Features
- Language Syntax
- Architecture
- Technologies
- Installation
- Usage
- Project Structure
- Examples
- Limitations
- Future Work
- Authors
- License
This project implements a compiler pipeline from source code to MIPS assembly language, with a complete front end and a partial code generator. Developed as part of a Compilers course (Winter 2024/2025), it showcases the fundamental phases of compilation with a unique twist: all language keywords are Christmas-themed, most of them in Spanish!
Key Accomplishments:
- Full lexical analysis with custom tokenization
- Complete syntactic analysis with error recovery
- Semantic analysis with symbol tables and type checking
- Partial MIPS assembly code generation
- Tested on MARS/SPIM simulators
- Tool: JFlex lexer generator
- Features:
- Tokenization of Christmas-themed keywords
- Support for integers, floats, booleans, characters, and strings
- Single-line and multi-line comment recognition
- Comprehensive error reporting with line and column numbers
- Tool: CUP (Constructor of Useful Parsers)
- Features:
- Context-free grammar for the Christmas language
- Robust error recovery mechanisms
- Support for nested structures
- Detailed syntax error reporting
- Features:
- Symbol Table Management: Multi-scope symbol tables with function and global scopes
- Type Checking: Validation of type compatibility in expressions and assignments
- Scope Validation: Proper handling of variable declarations and shadowing
- Type Inference: Basic type resolution for expressions
- Function Validation: Parameter count and type verification for function calls
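To make the multi-scope lookup and shadowing behaviour concrete, here is a minimal sketch of a scoped symbol table. It is an illustration only: the class `ScopedSymbolTable` and its methods are hypothetical names and are not taken from the project's `SymbolTable.java`, and types are stored as plain strings for brevity.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical scoped symbol table (assumed design, not the project's API). */
final class ScopedSymbolTable {
    // The innermost scope sits at the head of the deque.
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    ScopedSymbolTable() {
        scopes.push(new HashMap<>()); // global scope
    }

    /** Opens a new scope, for example when entering a function body. */
    void enterScope() {
        scopes.push(new HashMap<>());
    }

    /** Closes the innermost scope; the global scope can never be closed. */
    void exitScope() {
        if (scopes.size() == 1) {
            throw new IllegalStateException("cannot exit the global scope");
        }
        scopes.pop();
    }

    /** Declares a name in the current scope; returns false if it already exists there. */
    boolean declare(String name, String type) {
        return scopes.peek().putIfAbsent(name, type) == null;
    }

    /** Resolves a name from the innermost scope outward, so inner declarations shadow outer ones. */
    Optional<String> lookup(String name) {
        for (Map<String, String> scope : scopes) { // iterates head (innermost) first
            String type = scope.get(name);
            if (type != null) {
                return Optional.of(type);
            }
        }
        return Optional.empty();
    }
}
```

With this shape, declaring `_x_` as `rodolfo` globally and again as `bromista` inside a function scope makes `lookup("_x_")` return `bromista` until `exitScope()` is called, after which the global `rodolfo` entry is visible again.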
- Implemented:
  - Arithmetic operations (+, -, *, /, %, ^)
  - Control flow structures (if-else, while loops, for loops)
  - Function declarations with prologue/epilogue
  - Stack frame management
  - Register allocation for temporary values
  - Basic I/O operations (print statements)
- Tested On: MARS (MIPS Assembler and Runtime Simulator) and SPIM
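As a rough illustration of what emitting integer arithmetic can look like, the sketch below maps a binary operator to MIPS32 instructions. The class `ArithmeticEmitter` and its method are hypothetical and do not reflect the project's `CodeGenerator.java`; the operands are assumed to already live in registers, and the power operator (`^`) is left out because it has no single MIPS instruction.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical emitter for integer arithmetic (assumed design, not the project's API). */
final class ArithmeticEmitter {
    /** Emits MIPS32 for `dest = left op right`, where all three arguments are register names. */
    static List<String> emitBinary(String op, String dest, String left, String right) {
        List<String> out = new ArrayList<>();
        switch (op) {
            case "+":
                out.add("add " + dest + ", " + left + ", " + right);
                break;
            case "-":
                out.add("sub " + dest + ", " + left + ", " + right);
                break;
            case "*":
                out.add("mul " + dest + ", " + left + ", " + right);
                break;
            case "/":
                // Two-operand div leaves the quotient in LO.
                out.add("div " + left + ", " + right);
                out.add("mflo " + dest);
                break;
            case "%":
                // Two-operand div leaves the remainder in HI.
                out.add("div " + left + ", " + right);
                out.add("mfhi " + dest);
                break;
            default:
                throw new IllegalArgumentException("unsupported operator: " + op);
        }
        return out;
    }
}
```

For example, `emitBinary("%", "$t2", "$t0", "$t1")` yields the two lines `div $t0, $t1` and `mfhi $t2`.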
The Christmas language uses festive Spanish keywords. Here's a quick reference:
| Keyword | Standard Equivalent | Type |
|---|---|---|
| rodolfo | int | Integer |
| bromista | float | Float |
| trueno | boolean | Boolean |
| cupido | char | Character |
| cometa | string | String |
| Keyword | Standard Equivalent |
|---|---|
| elfo | if |
| hada | else |
| envuelve | while |
| duende | for |
| varios | switch |
| historia | case |
| ultimo | default |
| Keyword | Operation |
|---|---|
| navidad | Addition (+) |
| intercambio | Subtraction (-) |
| nochebuena | Multiplication (*) |
| reyes | Division (/) |
| magos | Modulus (%) |
| adviento | Power (^) |
| quien | Increment (++) |
| grinch | Decrement (--) |
| Keyword | Operation |
|---|---|
| melchor | AND (&&) |
| gaspar | OR (\|\|) |
| baltazar | NOT (!) |
| mary | Equals (==) |
| openslae | Not equals (!=) |
| snowball | Less than (<) |
| evergreen | Less than or equal (<=) |
| minstix | Greater than (>) |
| upatree | Greater than or equal (>=) |
| Keyword | Meaning |
|---|---|
| abrecuento | Open block ({) |
| cierracuento | Close block (}) |
| abreregalo | Open parenthesis (() |
| cierraregalo | Close parenthesis ()) |
| abreempaque | Open bracket ([) |
| cierraempaque | Close bracket (]) |
| finregalo | Semicolon (;) |
| entrega | Assignment (=) |
| Keyword | Operation |
|---|---|
| narra | Print |
| escucha | Read |
| Keyword | Meaning |
|---|---|
| envia | Return |
| corta | Break |
| sigue | Colon (:) |
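The keyword tables above amount to a lookup from festive tokens to their standard equivalents. The sketch below shows that idea with a small subset of the keywords. It is only an illustration: `KeywordTranslator` is a hypothetical helper, not part of the compiler, and it splits naively on whitespace, so it does not understand string literals or comments the way the real JFlex lexer does.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical keyword-to-standard-syntax translator (illustration only). */
final class KeywordTranslator {
    // A subset of the keyword tables; remaining entries would follow the same pattern.
    private static final Map<String, String> KEYWORDS = new HashMap<>();
    static {
        KEYWORDS.put("rodolfo", "int");
        KEYWORDS.put("elfo", "if");
        KEYWORDS.put("hada", "else");
        KEYWORDS.put("envuelve", "while");
        KEYWORDS.put("navidad", "+");
        KEYWORDS.put("minstix", ">");
        KEYWORDS.put("grinch", "--");
        KEYWORDS.put("abrecuento", "{");
        KEYWORDS.put("cierracuento", "}");
        KEYWORDS.put("abreregalo", "(");
        KEYWORDS.put("cierraregalo", ")");
        KEYWORDS.put("finregalo", ";");
        KEYWORDS.put("entrega", "=");
        KEYWORDS.put("envia", "return");
        KEYWORDS.put("narra", "print");
    }

    /** Replaces each whitespace-separated keyword with its standard equivalent. */
    static String translate(String source) {
        StringJoiner out = new StringJoiner(" ");
        for (String token : source.trim().split("\\s+")) {
            out.add(KEYWORDS.getOrDefault(token, token)); // unknown tokens pass through unchanged
        }
        return out.toString();
    }
}
```

Calling `KeywordTranslator.translate("rodolfo _x_ entrega 3 finregalo")` returns `int _x_ = 3 ;`.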
The compiler follows a traditional multi-phase architecture:
┌────────────────────┐
│    Source Code     │
│    (.txt file)     │
└─────────┬──────────┘
          │
          ▼
┌────────────────────┐
│ Lexical Analysis   │ ◄── JFlex (minijava.jflex)
│     (Lexer)        │
└─────────┬──────────┘
          │ Token Stream
          ▼
┌────────────────────┐
│ Syntactic Analysis │ ◄── CUP (parser.cup)
│     (Parser)       │
└─────────┬──────────┘
          │ Abstract Syntax Tree
          ▼
┌────────────────────┐
│ Semantic Analysis  │
│  - Symbol Table    │
│  - Type Checking   │
└─────────┬──────────┘
          │ Annotated AST
          ▼
┌────────────────────┐
│   Code Generator   │
│     (MIPS ASM)     │
└─────────┬──────────┘
          │
          ▼
┌────────────────────┐
│   Output (.asm)    │
└────────────────────┘
Lexer:
- Pattern matching for tokens
- Whitespace and comment handling
- Token creation with position tracking

Parser:
- Grammar rules definition
- Error recovery strategies
- AST construction
- Integration with semantic analyzer

Symbol table:
- Scope management (global and function-level)
- Variable and function information storage
- Type information tracking

Semantic analyzer:
- Type checking for expressions and assignments
- Function call validation
- Variable declaration and usage verification
- Scope resolution

Code generator:
- MIPS instruction emission
- Register allocation
- Stack frame management
- Label generation for control flow
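Label generation for control flow can be sketched as follows for a loop of the form `while (x > 0) { ... }`. This is an assumed design, not the project's implementation: `WhileLoopEmitter`, the label naming scheme, and the choice of `blez` for the exit test are all illustrative, and the loop variable is assumed to be held in a register.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical emitter for `while (reg > 0)` loops with unique labels (illustration only). */
final class WhileLoopEmitter {
    private int nextLoopId = 0;

    /** Wraps the given body instructions in a loop that runs while conditionReg is greater than zero. */
    List<String> emitWhilePositive(String conditionReg, List<String> body) {
        int id = nextLoopId++; // one id per loop keeps its labels unique
        String start = "while_start_" + id;
        String end = "while_end_" + id;

        List<String> out = new ArrayList<>();
        out.add(start + ":");
        out.add("blez " + conditionReg + ", " + end); // leave the loop when reg <= 0
        out.addAll(body);
        out.add("j " + start); // jump back and re-test the condition
        out.add(end + ":");
        return out;
    }
}
```

Passing `"$t0"` and a body of `addi $t0, $t0, -1` on the first call produces `while_start_0:`, `blez $t0, while_end_0`, the body line, `j while_start_0`, and `while_end_0:`. A second call uses the suffix `_1`, so nested or consecutive loops do not collide.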
- Language: Java
- Build Tool: Gradle
- Lexer Generator: JFlex 1.8.2
- Parser Generator: CUP (Java Cup)
- Target Architecture: MIPS32
- Testing: MARS, SPIM simulators
- Java Development Kit (JDK) 11 or higher
- Gradle (included via wrapper)
- MARS or SPIM simulator (for running generated assembly)
- Clone the repository:
git clone <repository-url>
cd Christmas-Compiler/Programa
- Build the project:
# On Windows
gradlew.bat build
# On Linux/Mac
./gradlew build
- Generate Lexer and Parser (if needed):
# Run the generator
./gradlew run
- Create a source file with Christmas language syntax (e.g., program.txt)
- Run the compiler:
// In your Java code
Main compiler = new Main();
compiler.test("path/to/program.txt");
- Output: The compiler generates an .asm file with the same base name as the input
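The "same base name" rule for the output file can be expressed in a few lines. The helper below is a hypothetical sketch of that mapping, not code from the project; how `Main` actually derives the output path is not shown in this README.

```java
import java.nio.file.Path;

/** Hypothetical helper that derives the .asm output path from an input path. */
final class OutputNames {
    static Path asmPathFor(Path input) {
        String name = input.getFileName().toString();
        int dot = name.lastIndexOf('.');
        // Strip the extension if there is one; a leading dot is treated as part of the name.
        String base = (dot > 0) ? name.substring(0, dot) : name;
        // Place the .asm file next to the input file.
        return input.resolveSibling(base + ".asm");
    }
}
```

Under this sketch, an input path of `tests/test01.txt` maps to `tests/test01.asm`.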
# Input: test01.txt
# Output: test01.asm
# Run the generated assembly in MARS:
# 1. Open MARS simulator
# 2. Load test01.asm
# 3. Assemble the program
# 4. Run it
Christmas-Compiler/
├── README.md
├── info.txt
├── Documentacion/
└── Programa/
    ├── build.gradle
    ├── gradlew
    ├── gradlew.bat
    ├── settings.gradle
    ├── gradle/
    └── src/
        ├── lex/
        │   ├── minijava.jflex          # Lexer specification
        │   ├── parser.cup              # Parser grammar
        │   └── salida.txt
        ├── main/java/
        │   ├── compiler/
        │   │   ├── Generator.java      # Lexer/Parser generator
        │   │   ├── Main.java           # Entry point
        │   │   └── Tester.java         # Testing utilities
        │   ├── destCodeGenerator/
        │   │   ├── CodeGenerator.java  # MIPS code generation
        │   │   └── Operations.java
        │   ├── organizer/
        │   │   └── Organize.java
        │   ├── parser/
        │   │   ├── Lexer.java          # Generated lexer
        │   │   ├── parser.java         # Generated parser
        │   │   └── sym.java            # Token symbols
        │   ├── semanticalAnalysis/
        │   │   ├── ControlStructureOperations.java
        │   │   ├── Function.java
        │   │   └── Variable.java
        │   └── tables/
        │       ├── FunctionInfo.java
        │       ├── SymbolInfo.java
        │       ├── SymbolTable.java
        │       └── TokenInfo.java
        └── tests/
            ├── test01.txt              # Test programs
            ├── test01.asm              # Generated assembly
            ├── testIF.txt
            └── ...
Christmas Language:
rodolfo _suma_ abreregalo rodolfo _a_, rodolfo _b_ cierraregalo abrecuento
rodolfo _resultado_ entrega _a_ navidad _b_ finregalo
envia _resultado_ finregalo
cierracuento
rodolfo _verano_ abrecuento
rodolfo _x_ entrega 3 finregalo
rodolfo _y_ entrega 5 finregalo
rodolfo _total_ entrega _suma_ abreregalo _x_, _y_ cierraregalo finregalo
narra abreregalo "La suma es: ", _total_ cierraregalo finregalo
cierracuento
Equivalent in Standard Syntax:
int sum(int a, int b) {
int result = a + b;
return result;
}
int main() {
int x = 3;
int y = 5;
int total = sum(x, y);
print("La suma es: ", total);
}
Christmas Language:
rodolfo _verano_ abrecuento
rodolfo _x_ entrega 10 finregalo
elfo abreregalo _x_ minstix 5 cierraregalo abrecuento
narra abreregalo "x es mayor que 5" cierraregalo finregalo
cierracuento
hada abrecuento
narra abreregalo "x es menor o igual a 5" cierraregalo finregalo
cierracuento
envuelve abreregalo _x_ minstix 0 cierraregalo abrecuento
narra abreregalo _x_ cierraregalo finregalo
_x_ grinch finregalo
cierracuento
cierracuento
Equivalent in Standard Syntax:
int main() {
int x = 10;
if (x > 5) {
print("x es mayor que 5");
} else {
print("x es menor o igual a 5");
}
while (x > 0) {
print(x);
x--;
}
}
Christmas Language:
rodolfo _verano_ abrecuento
rodolfo _arr_ abreempaque 5 cierraempaque entrega 0 finregalo
_arr_ abreempaque 0 cierraempaque entrega 10 finregalo
_arr_ abreempaque 1 cierraempaque entrega 20 finregalo
narra abreregalo "Primer elemento: ", _arr_ abreempaque 0 cierraempaque cierraregalo finregalo
cierracuento
Due to course time constraints, the MIPS code generator was partially implemented. The following features are NOT fully supported in the back-end:
- Advanced array operations: Dynamic allocation, multi-dimensional arrays
- String operations: String concatenation, manipulation
- Float arithmetic: Limited floating-point support
- Switch statements: Code generation for switch-case structures
- Nested function calls: Complex call chains
- Memory management: No heap allocation
- Standard library: No built-in functions beyond basic I/O
- Optimization: No code optimization passes
The following features are supported in the back-end:
- Basic arithmetic operations (int)
- Simple control flow (if-else, while, for)
- Function calls with integer parameters
- Basic print statements
- Variable assignments
- Simple expressions
Note: The front-end (lexical, syntactic, and semantic analysis) is fully functional and correctly validates all language constructs. Only the code generation phase has limitations.
Potential enhancements for this project:
- Complete MIPS Code Generation:
  - Full array support
  - Complete floating-point operations
  - Switch statement implementation
- Optimizations:
  - Constant folding
  - Dead code elimination
  - Register allocation optimization
- Extended Features:
  - Structs/Records
  - Pointers
  - Dynamic memory allocation
  - Standard library functions
- Development Tools:
  - Interactive debugger
  - Visual AST representation
  - IDE plugin with syntax highlighting
- Testing:
  - Comprehensive test suite
  - Performance benchmarks
  - Fuzz testing
Adrián José Villalobos Peraza & Isaac Ramírez Rojas
- Course: Compiladores e Intérpretes
- Term: Winter 2024/2025
- Institution: Instituto Tecnológico de Costa Rica
This project was developed for educational purposes as part of a university course.
Made with festive spirit and lots of coffee