API Reference
David Sisco edited this page Jan 6, 2026
Quick reference for core types, enums, and configuration options.
| Tokenization type | Description |
|---|---|
| `Lexer` | Level 1: stateless character classifier |
| `TokenParser` | Level 2: combines simple tokens into semantic tokens |
| `Tokenizer` | Combined Level 1 + 2 (legacy; prefer the two-level API) |
| `TokenizerOptions` | Configuration for tokenization |

| Token type | Description |
|---|---|
| `SimpleToken` | Level 1 struct: atomic character classifications |
| `Token` | Level 2 abstract base: semantic tokens |
| `IdentToken` | Identifiers |
| `WhitespaceToken` | Spaces, tabs, newlines |
| `SymbolToken` | Single-character symbols |
| `OperatorToken` | Multi-character operators |
| `NumericToken` | Number literals |
| `StringToken` | Quoted strings |
| `TaggedIdentToken` | Prefixed identifiers (`#define`, `@attr`) |
| `CommentToken` | Comments |
| `BlockToken` | Delimited blocks with children |
| `ErrorToken` | Parse errors |

| Syntax tree type | Description |
|---|---|
| `SyntaxTree` | Main entry point: parse, query, edit |
| `SyntaxNode` | Abstract navigable node |
| `SyntaxToken` | Terminal token node |
| `SyntaxBlock` | Block node with children |
| `SyntaxList` | Root token list |
| `SyntaxEditor` | Batch mutations |
| `TreeWalker` | DOM-style traversal |
| `Schema` | Unified configuration |
| `Query` | Static factory for node queries |
| `Trivia` | Whitespace/comment info |

```csharp
public enum TokenType
{
    Ident,
    Whitespace,
    Symbol,
    Operator,
    Numeric,
    String,
    TaggedIdent,
    Comment,
    Block,
    Error
}
```

Node classification with range-based categories:

| Range | Category | Examples |
|---|---|---|
| 0-99 | Token | `Ident`, `Numeric`, `String`, `Operator` |
| 100-499 | Container | `BraceBlock`, `ParenBlock`, `TokenList` |
| 500-1999 | Keyword | Schema-defined (`if`, `while`, `int`) |
| 2000-65535 | Semantic | Schema-defined syntax patterns |

Use `kind.IsLeaf()`, `kind.IsContainer()`, `kind.IsKeyword()`, and `kind.IsSemantic()` to check which range a kind falls in.
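
The range table can be read as a set of threshold checks. The following is a minimal sketch, assuming kinds are plain `ushort` values and that `IsLeaf()` corresponds to the 0-99 Token range. The constants and local functions are hypothetical stand-ins that mirror the table; they are not the library's own extension methods, which may be implemented differently.

```csharp
using System;

// Hypothetical upper bounds taken from the range table (not library constants).
const ushort TokenMax = 99;
const ushort ContainerMax = 499;
const ushort KeywordMax = 1999;

// Hypothetical stand-ins for kind.IsLeaf(), kind.IsContainer(), etc.
bool IsLeaf(ushort kind) => kind <= TokenMax;
bool IsContainer(ushort kind) => kind > TokenMax && kind <= ContainerMax;
bool IsKeyword(ushort kind) => kind > ContainerMax && kind <= KeywordMax;
bool IsSemantic(ushort kind) => kind > KeywordMax;

// Maps a raw kind value to the category name used in the table.
string Describe(ushort kind) =>
    IsLeaf(kind) ? "Token"
    : IsContainer(kind) ? "Container"
    : IsKeyword(kind) ? "Keyword"
    : "Semantic";

Console.WriteLine(Describe(42));    // Token
Console.WriteLine(Describe(100));   // Container
Console.WriteLine(Describe(1999));  // Keyword
Console.WriteLine(Describe(2000));  // Semantic
```

Because the ranges are contiguous and non-overlapping, exactly one of the four checks is true for any `ushort` value.
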

```csharp
public enum NumericType
{
    Integer,       // 123
    FloatingPoint  // 3.14, .5
}
```

```csharp
public enum BlockTokenType
{
    Brace,    // { }
    Bracket,  // [ ]
    Paren     // ( )
}
```

```csharp
public enum TriviaKind
{
    Whitespace,
    Newline,
    SingleLineComment,
    MultiLineComment
}
```

```csharp
public enum FilterResult
{
    Accept,  // Include node in results
    Reject,  // Exclude node and entire subtree
    Skip     // Exclude node, still visit children
}
```

```csharp
[Flags]
public enum NodeFilter
{
    All = Leaves | Blocks,
    Leaves = 1,
    Blocks = 2
}
```

Pre-defined operator sets:

| Set | Contents |
|---|---|
| `Universal` | `==`, `!=`, `&&`, `\|\|`, `<=`, `>=`, `<<`, `>>` |
| `CFamily` | Universal + `++`, `--`, `->`, `::`, `...` |
| `JavaScript` | CFamily + `===`, `!==`, `=>`, `?.`, `??`, `**` |
| `Rust` | CFamily + `=>`, `..`, `..=`, `::` |
| `Python` | Universal + `**`, `//`, `:=`, `@` |

```csharp
var options = TokenizerOptions.Default
    .WithOperators(CommonOperators.JavaScript);
```

Pre-defined keyword category presets:

| Category | Keywords |
|---|---|
| `CTypes` | `int`, `float`, `double`, `void`, `char`, `bool`, etc. |
| `ControlFlow` | `if`, `else`, `while`, `for`, `return`, `break`, etc. |
| `Modifiers` | `public`, `private`, `static`, `const`, `readonly` |
| `Values` | `true`, `false`, `null`, `this`, `base` |

```csharp
var schema = Schema.Create()
    .DefineKeywords(CommonKeywords.CTypes)
    .DefineKeywordCategory("SqlKeywords", caseSensitive: false, "SELECT", "FROM", "WHERE")
    .Build();

// Query keyword info
var info = schema.GetKeywordInfo(leaf.Kind); // Returns KeywordInfo { Category, Keyword, Kind }
```

Pre-defined comment styles:

| Style | Start | End |
|---|---|---|
| `CStyleSingleLine` | `//` | newline |
| `CStyleMultiLine` | `/*` | `*/` |
| `HashSingleLine` | `#` | newline |

```csharp
// Built-in
var options = TokenizerOptions.Default
    .WithCommentStyles(CommentStyle.CStyleSingleLine);

// Custom single-line
var lua = new CommentStyle("--");

// Custom multi-line
var html = new CommentStyle("<!--", "-->");
```

Format specifiers for tokens:

| Specifier | Output | Example |
|---|---|---|
| `G` | Content (default) | `foo` |
| `T` | Type name | `IdentToken` |
| `P` | Position | `42` |
| `R` | Range | `42..45` |
| `D` | Debug | `Ident[42..45]` |

```csharp
Console.WriteLine($"{token:D}"); // "Ident[0..3]"
```

```csharp
// Basic tokenization
// (here and below, paired lines list alternative overloads, not sequential statements)
IEnumerable<Token> tokens = "source".TokenizeToTokens();
IEnumerable<Token> tokens = "source".TokenizeToTokens(options);
```

```csharp
// Async tokenization
Task<ImmutableArray<Token>> tokens = stream.TokenizeAsync();
Task<ImmutableArray<Token>> tokens = stream.TokenizeAsync(options);

// Streaming
IAsyncEnumerable<Token> tokens = stream.TokenizeStreamingAsync();
IAsyncEnumerable<Token> tokens = stream.TokenizeStreamingAsync(options);
```

```csharp
// Streaming from a PipeReader
IAsyncEnumerable<Token> tokens = pipeReader.TokenizeStreamingAsync();
IAsyncEnumerable<Token> tokens = pipeReader.TokenizeStreamingAsync(options, encoding);
```

```csharp
// Parsing
var tree = SyntaxTree.Parse(source);
var tree = SyntaxTree.Parse(source, schema);

// Properties
SyntaxNode root = tree.Root;
IEnumerable<SyntaxToken> leaves = tree.Leaves;
bool hasSchema = tree.HasSchema;

// Querying
IEnumerable<SyntaxNode> nodes = tree.Select(query);
IEnumerable<T> matches = tree.Match<T>(); // Requires schema

// Editing
SyntaxEditor editor = tree.CreateEditor();
tree.Undo();
tree.Redo();

// Schema
SyntaxTree bound = tree.WithSchema(schema);

// Traversal
TreeWalker walker = tree.CreateTreeWalker();
```

See Query API for complete reference.

```csharp
// Named queries
Query.Ident("main")
Query.Symbol(".")
Query.Operator("=>")

// Any-kind queries
Query.AnyIdent
Query.AnyBlock
Query.ParenBlock

// Combinators
query.First()
query.Where(predicate)
query.FollowedBy(other)
query.Then(other)
Query.Sequence(a, b, c)

// Block boundaries
block.Start()  // Opening delimiter
block.End()    // Closing delimiter
```

```csharp
editor.InsertBefore(target, text)
editor.InsertAfter(target, text)
editor.Replace(target, text)
editor.Replace(target, transformer)
editor.Edit(target, contentTransformer)
editor.Remove(target)
editor.Commit()
```

- .NET 8.0 or later
- `CommunityToolkit.HighPerformance` (automatically included)
- Maximum file size: ~2GB (positions stored as `int`)
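
The ~2GB ceiling follows from the largest value an `int` can hold, 2,147,483,647. A caller could check input length up front rather than discover the limit mid-parse. The sketch below assumes the limit is exactly `int.MaxValue` positions; `FitsInIntPositions` is a hypothetical helper, not part of the library.

```csharp
using System;

// Hypothetical guard: true when every position in the input fits in an int.
bool FitsInIntPositions(long length) => length >= 0 && length <= int.MaxValue;

Console.WriteLine(int.MaxValue);                                // 2147483647
Console.WriteLine(FitsInIntPositions(1_500_000_000L));          // True
Console.WriteLine(FitsInIntPositions((long)int.MaxValue + 1));  // False
```

In practice the length passed in might come from something like `new FileInfo(path).Length` before the file is read.
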
- Getting Started — Quick introduction
- Architecture — Design overview
- Configuration — TokenizerOptions details