dsisco11/TinyTokenizer

TinyTokenizer

A high-performance syntax tree library for .NET 8+ with fluent queries, pattern matching, and undo/redo editing — built on a zero-allocation tokenizer with SIMD optimization.

Features

  • TinyAst — Red-green syntax tree with fluent queries, editing, and undo/redo
  • Query API — CSS-like selectors with combinators, lookahead, and repetition
  • SyntaxEditor — Batch mutations with atomic commit and full undo/redo support
  • Syntax Nodes — Pattern-based AST matching (function calls, property access, etc.)
  • Schema System — Unified configuration for tokenization + syntax node definitions
  • TreeWalker — DOM-style filtered tree traversal
  • High Performance — Zero-allocation parsing with SIMD-optimized SearchValues<char>
  • Error Recovery — Gracefully handles malformed input and continues parsing
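
The SearchValues<char> named in the High Performance bullet is a standard .NET 8 type for vectorized character-set searches. As a rough sketch of the technique (not TinyTokenizer's internal code; the Scanner class, the delimiter set, and IdentLength are assumptions made for illustration), a scanner can find the end of an identifier without allocating:

```csharp
using System;
using System.Buffers;

// Illustrative only: not part of TinyTokenizer's API.
static class Scanner
{
    // Hypothetical delimiter set; the library's real symbol set is configurable.
    private static readonly SearchValues<char> Delimiters =
        SearchValues.Create("(){}[],; ");

    // Returns the length of the leading run of non-delimiter characters.
    // Works on a span, so no substring is allocated during the search.
    public static int IdentLength(ReadOnlySpan<char> input)
    {
        int index = input.IndexOfAny(Delimiters);
        return index < 0 ? input.Length : index;
    }
}
```

Because the method takes a ReadOnlySpan<char> and returns only a length, the caller can slice the input instead of creating intermediate strings.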

Installation

dotnet add package TinyAst

Quick Start

TinyAst

using TinyTokenizer.Ast;

// Parse source into a syntax tree
var tree = SyntaxTree.Parse("function foo() { return 1; }");

// Query nodes with CSS-like selectors
var idents = tree.Select(Query.AnyIdent);  // [Ident("function"), Ident("foo"), Ident("return")]

// Fluent mutations with undo support
tree.CreateEditor()
    .Replace(Query.Ident("foo"), "bar")
    .Insert(Query.BraceBlock.First().InnerStart(), "console.log('enter');")
    .Commit();

// Undo/redo
tree.Undo();
tree.Redo();
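
The batch-then-commit model can be pictured with a small self-contained sketch. This is a string-backed stand-in, not TinyAst's SyntaxEditor: the BatchDocument class and its members are hypothetical, and edits are assumed not to overlap. It shows one way a whole batch can become a single undo step:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Illustrative only: a string-backed stand-in, not TinyAst's SyntaxEditor.
sealed class BatchDocument
{
    private readonly Stack<string> _undo = new();
    private readonly Stack<string> _redo = new();

    public string Text { get; private set; }

    public BatchDocument(string text) => Text = text;

    // Applies all edits as one step. Offsets refer to the text before the batch,
    // and edits are assumed not to overlap.
    public void Commit(IEnumerable<(int Start, int Length, string Replacement)> edits)
    {
        string result = Text;

        // Apply from the end of the text backwards so earlier offsets stay valid.
        foreach (var edit in edits.OrderByDescending(x => x.Start))
            result = result.Remove(edit.Start, edit.Length).Insert(edit.Start, edit.Replacement);

        // State changes only after every edit succeeded, so a failure leaves Text untouched.
        _undo.Push(Text);
        _redo.Clear();
        Text = result;
    }

    public void Undo()
    {
        if (_undo.Count == 0) return;
        _redo.Push(Text);
        Text = _undo.Pop();
    }

    public void Redo()
    {
        if (_redo.Count == 0) return;
        _undo.Push(Text);
        Text = _redo.Pop();
    }
}
```

Committing a replacement of foo and an insertion after the opening brace in one call means a single Undo restores the original text, mirroring the atomic commit described above.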

Pattern Matching with Schema

var tree = SyntaxTree.Parse("obj.method(x)", Schema.Default);

var methodCalls = tree.Match<MethodCallSyntax>().ToList();  // [MethodCallSyntax { Object="obj", Method="method" }]

Low-Level Tokenization

For scenarios where you don't need a syntax tree:

using TinyTokenizer;

var tokens = "func(a, b)".TokenizeToTokens();

// tokens contains:
// - IdentToken("func")
// - BlockToken("(a, b)") with children:
//   - IdentToken("a"), SymbolToken(","), WhitespaceToken(" "), IdentToken("b")
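
To make the nested shape of that output concrete, here is a minimal recursive sketch that produces the same structure for parenthesized input. It is not TinyTokenizer's implementation: the record types and MiniTokenizer are hypothetical, only parentheses are treated as block delimiters, and there is no error recovery.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical token shapes for illustration; TinyTokenizer's real types differ.
abstract record Tok(string Text);
sealed record IdentTok(string Text) : Tok(Text);
sealed record SymbolTok(string Text) : Tok(Text);
sealed record WhitespaceTok(string Text) : Tok(Text);
sealed record BlockTok(string Text, IReadOnlyList<Tok> Children) : Tok(Text);

static class MiniTokenizer
{
    public static List<Tok> Tokenize(string src)
    {
        int pos = 0;
        return ReadUntil(src, ref pos, closer: null);
    }

    // Reads tokens until `closer` is reached (or end of input when closer is null).
    private static List<Tok> ReadUntil(string src, ref int pos, char? closer)
    {
        var tokens = new List<Tok>();
        while (pos < src.Length && src[pos] != closer)
        {
            char c = src[pos];
            int start = pos;
            if (c == '(')
            {
                pos++;                                      // skip '('
                var children = ReadUntil(src, ref pos, ')');
                if (pos < src.Length) pos++;                // skip ')' if present
                tokens.Add(new BlockTok(src[start..pos], children));
            }
            else if (char.IsLetterOrDigit(c) || c == '_')
            {
                while (pos < src.Length && (char.IsLetterOrDigit(src[pos]) || src[pos] == '_')) pos++;
                tokens.Add(new IdentTok(src[start..pos]));
            }
            else if (char.IsWhiteSpace(c))
            {
                while (pos < src.Length && char.IsWhiteSpace(src[pos])) pos++;
                tokens.Add(new WhitespaceTok(src[start..pos]));
            }
            else
            {
                pos++;                                      // any other single character
                tokens.Add(new SymbolTok(src[start..pos]));
            }
        }
        return tokens;
    }
}
```

Each block keeps its full source text alongside its children, so the original input can be reproduced by concatenating the top-level Text values.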

📚 Documentation

Full documentation on the Wiki →

TinyAst

Topic            Description
TinyAst Guide    Syntax tree API
Schema           Unified configuration
Query API        CSS-like node selectors
SyntaxEditor     Batch mutations, undo/redo
Syntax Nodes     Pattern-based matching
TreeWalker       DOM-style traversal
Trivia           Whitespace preservation

Low-Level Tokenization

Topic            Description
Token Types      SimpleToken vs Token, all types
Configuration    Operators, comments, symbols
Async Streaming  Stream/PipeReader APIs
Error Handling   ErrorToken and recovery

Reference

Topic            Description
Getting Started  Installation and basic usage
Architecture     Two-level design, red-green trees
API Reference    Types, enums, methods

Requirements

  • .NET 8.0 or later
  • CommunityToolkit.HighPerformance (automatically included)

License

MIT
