Implements a lexer, an LL(1) recursive descent parser, a rudimentary pretty printer (to visualize AST nodes), and a module that generates LLVM Intermediate Representation (IR). The language definition is based on Robert Nystrom's book Crafting Interpreters.
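As a rough illustration of the first stage, the sketch below splits source text into tokens. It is not this project's lexer: the `TokenKind`, `Token`, and `tokenize` names are hypothetical, only single-character operators are handled, and two-character operators such as `==` or `<=` are left out for brevity.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical token kinds; only the terminals needed for the example are listed.
enum class TokenKind {
    Var, Print, Class, Return, While, Identifier, Number,
    Equal, Greater, Minus, Star, Semicolon,
    OpenParen, CloseParen, OpenCurly, CloseCurly, Unknown, Eof
};

struct Token {
    TokenKind kind;
    std::string text;
};

// Splits `src` into tokens and appends a final Eof token.
std::vector<Token> tokenize(const std::string& src) {
    static const std::unordered_map<std::string, TokenKind> keywords = {
        {"var", TokenKind::Var},       {"print", TokenKind::Print},
        {"class", TokenKind::Class},   {"return", TokenKind::Return},
        {"while", TokenKind::While}};
    static const std::unordered_map<char, TokenKind> punctuation = {
        {'=', TokenKind::Equal},       {'>', TokenKind::Greater},
        {'-', TokenKind::Minus},       {'*', TokenKind::Star},
        {';', TokenKind::Semicolon},   {'(', TokenKind::OpenParen},
        {')', TokenKind::CloseParen},  {'{', TokenKind::OpenCurly},
        {'}', TokenKind::CloseCurly}};
    auto isWordChar = [](char ch) {
        return std::isalnum(static_cast<unsigned char>(ch)) != 0 || ch == '_';
    };

    std::vector<Token> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        const char c = src[i];
        if (std::isspace(static_cast<unsigned char>(c))) { ++i; continue; }
        const std::size_t start = i;
        if (std::isdigit(static_cast<unsigned char>(c))) {
            // Integer literal: consume a run of digits.
            while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) ++i;
            tokens.push_back({TokenKind::Number, src.substr(start, i - start)});
        } else if (isWordChar(c)) {
            // Keyword or identifier: consume letters, digits and underscores.
            while (i < src.size() && isWordChar(src[i])) ++i;
            const std::string word = src.substr(start, i - start);
            const auto it = keywords.find(word);
            tokens.push_back({it != keywords.end() ? it->second : TokenKind::Identifier, word});
        } else {
            // Single-character punctuation; anything unrecognized becomes Unknown.
            const auto it = punctuation.find(c);
            tokens.push_back({it != punctuation.end() ? it->second : TokenKind::Unknown,
                              std::string(1, c)});
            ++i;
        }
    }
    tokens.push_back({TokenKind::Eof, ""});
    return tokens;
}
```

Calling `tokenize("var input = 5;")` yields six tokens: `Var`, `Identifier`, `Equal`, `Number`, `Semicolon`, and `Eof`.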
$ mkdir build
$ cmake -DLLVM_DIR=path_to_llvm_build/lib/cmake/llvm -B build/
$ cd build
$ cmake --build .
var input = 5;
var result = 1;
while(input > 1) {
  result = result * input;
  input = input - 1;
}
define void @main() {
entry:
  %input = alloca i32, align 4
  store i32 5, ptr %input, align 4
  %res = alloca i32, align 4
  store i32 1, ptr %res, align 4
  %tmp = alloca i32, align 4
  store i32 1, ptr %tmp, align 4
  br label %loop_holder

loop_exit:                                        ; preds = %loop_holder
  ret void

block:                                            ; preds = %loop_holder
  %0 = load i32, ptr %res, align 4
  %1 = load i32, ptr %input, align 4
  %2 = mul i32 %0, %1
  store i32 %2, ptr %res, align 4
  %3 = load i32, ptr %input, align 4
  %4 = sub i32 %3, 1
  store i32 %4, ptr %input, align 4
  ret void

loop_holder:                                      ; preds = %entry
  %5 = load i32, ptr %input, align 4
  %6 = icmp ugt i32 %5, 1
  br i1 %6, label %block, label %loop_exit
}
The LLVM IR generated above can be compiled to assembly or object files with LLVM's llc for any of the target architectures that LLVM supports (e.g. x86, RISC-V, Arm).
- Program -> Declaration* EOF
- Declaration -> VarDecl | FuncDecl | ClassDecl | Statement
- Statement -> ExprStmt | AssignStmt | ForStmt | IfStmt | PrintStmt | ReturnStmt | WhileStmt | Block
- ExprStmt -> Literal | Identifier | Unary | Binary | Grouping
- AssignStmt -> Identifier EQUAL ExprStmt SEMICOLON
- ForStmt -> FOR OPEN_PAREN VarDecl SEMICOLON ExprStmt SEMICOLON ExprStmt CLOSE_PAREN Block
- IfStmt -> IF OPEN_PAREN ExprStmt CLOSE_PAREN Block | IF OPEN_PAREN ExprStmt CLOSE_PAREN Block ELSE Block
- PrintStmt -> PRINT OPEN_PAREN ExprStmt CLOSE_PAREN SEMICOLON
- ReturnStmt -> RETURN SEMICOLON | RETURN ExprStmt SEMICOLON
- WhileStmt -> WHILE OPEN_PAREN ExprStmt CLOSE_PAREN Block
- Block -> OPEN_CURLY Statement* CLOSE_CURLY
- Literal -> NUMBER | STRING | "true" | "false" | "nil"
- Grouping -> "(" ExprStmt ")"
- Unary -> ( "-" | "!" ) ExprStmt
- Binary -> ExprStmt Operator ExprStmt
- Operator -> "==" | "!=" | "<" | "<=" | ">" | ">=" | "+" | "-" | "*" | "/"
- VarDecl -> VAR Identifier SEMICOLON | VAR Identifier EQUAL ExprStmt SEMICOLON
- FuncDecl -> FUN Function
- ClassDecl -> CLASS Identifier OPEN_CURLY Function* CLOSE_CURLY SEMICOLON
- Function -> Identifier OPEN_PAREN CLOSE_PAREN Block | Identifier OPEN_PAREN Parameters CLOSE_PAREN Block
- Parameters -> Identifier ( "," Identifier )*
- VAR -> "var"
- PRINT -> "print"
- CLASS -> "class"
- RETURN -> "return"
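
The Binary rule, as written, is left-recursive, so a recursive descent parser usually splits it into one level per operator precedence. The sketch below shows one way to do that for the expression subset (Literal, Identifier, Unary, Binary, Grouping) and prints the resulting AST in a parenthesized form. It is an illustration under assumptions, not this project's parser or pretty printer: the `Expr`, `Parser`, and `print` names are hypothetical, the precedence order follows Crafting Interpreters, and it reads characters directly instead of consuming lexer tokens.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical AST node: a leaf, a unary, a binary, or a grouping.
struct Expr {
    std::string op;                  // operator text, "group", or empty for a leaf
    std::string value;               // leaf text (number or identifier)
    std::unique_ptr<Expr> lhs, rhs;  // unary and grouping nodes use only lhs
};
using ExprPtr = std::unique_ptr<Expr>;

class Parser {
public:
    explicit Parser(std::string src) : src_(std::move(src)) {}

    ExprPtr parse() {
        ExprPtr e = binaryLevel(0);
        skipSpace();
        if (pos_ != src_.size()) throw std::runtime_error("unexpected trailing input");
        return e;
    }

private:
    static ExprPtr make(std::string op, std::string value, ExprPtr lhs, ExprPtr rhs) {
        auto e = std::make_unique<Expr>();
        e->op = std::move(op);
        e->value = std::move(value);
        e->lhs = std::move(lhs);
        e->rhs = std::move(rhs);
        return e;
    }
    void skipSpace() {
        while (pos_ < src_.size() && std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
    }
    // Consumes `s` if it is the next non-space text.
    bool match(const std::string& s) {
        skipSpace();
        if (src_.compare(pos_, s.size(), s) != 0) return false;
        pos_ += s.size();
        return true;
    }
    // One function per precedence level, lowest first; operators are left-associative.
    ExprPtr binaryLevel(std::size_t level) {
        static const std::vector<std::vector<std::string>> ops = {
            {"==", "!="}, {"<=", ">=", "<", ">"}, {"+", "-"}, {"*", "/"}};
        if (level == ops.size()) return unary();
        ExprPtr left = binaryLevel(level + 1);
        for (;;) {
            std::string matched;
            for (const auto& op : ops[level]) {
                if (match(op)) { matched = op; break; }
            }
            if (matched.empty()) return left;
            ExprPtr right = binaryLevel(level + 1);
            left = make(matched, "", std::move(left), std::move(right));
        }
    }
    ExprPtr unary() {
        if (match("-")) return make("-", "", unary(), nullptr);
        if (match("!")) return make("!", "", unary(), nullptr);
        return primary();
    }
    ExprPtr primary() {
        if (match("(")) {
            ExprPtr inner = binaryLevel(0);
            if (!match(")")) throw std::runtime_error("expected ')'");
            return make("group", "", std::move(inner), nullptr);
        }
        skipSpace();
        const std::size_t start = pos_;
        while (pos_ < src_.size() &&
               (std::isalnum(static_cast<unsigned char>(src_[pos_])) ||
                src_[pos_] == '.' || src_[pos_] == '_')) ++pos_;
        if (start == pos_) throw std::runtime_error("expected expression");
        return make("", src_.substr(start, pos_ - start), nullptr, nullptr);
    }

    std::string src_;
    std::size_t pos_ = 0;
};

// Renders the AST in prefix form, e.g. (+ 1 (* 2 3)).
std::string print(const Expr& e) {
    if (e.op.empty()) return e.value;
    std::string out = "(" + e.op + " " + print(*e.lhs);
    if (e.rhs) out += " " + print(*e.rhs);
    return out + ")";
}
```

For example, `print(*Parser("1 + 2 * 3").parse())` returns `(+ 1 (* 2 3))`, and the loop condition `input > 1` from the sample program becomes `(> input 1)`.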