[FEATURE] 158-language support via universal tree-sitter grammar loader #18

## Problem

CodeLens currently supports ~10 languages, each with its own hand-written parser, so every new language requires a new parser file. This doesn't scale, and agents working on Go, Java, C++, Rust (beyond what is currently supported), etc. get no structural analysis.

## Proposed Approach: Universal Grammar Loader

Tree-sitter grammars exist for 150+ languages and are available as npm packages (many are also published on PyPI). Instead of writing per-language parsers, write one **generic extraction layer** that works on any tree-sitter grammar:

```python
UNIVERSAL_NODE_TYPES = {
    # Maps tree-sitter node type names -> CodeLens concept
    'function_definition': 'Function',
    'function_declaration': 'Function',
    'method_definition': 'Method',
    'class_declaration': 'Class',
    'class_definition': 'Class',
    'import_statement': 'Import',
    'call_expression': 'CallSite',
    # ... etc
}
```

Many tree-sitter grammars use similar node type names, so a universal mapper should cover roughly 80% of languages in ~100 lines of code. Language-specific overrides handle the remaining ~20%.
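A minimal sketch of how the universal table and per-language overrides could combine. `LANGUAGE_OVERRIDES`, `mapping_for`, `extract`, and `FakeNode` are hypothetical names used for illustration, and the Ruby node names should be checked against that grammar's `node-types.json`. The walker only touches `.type` and `.children`, which tree-sitter's Python `Node` objects provide, so a small stand-in class is enough to try it out:

```python
from dataclasses import dataclass, field

# Shared defaults, abbreviated from the table above.
UNIVERSAL_NODE_TYPES = {
    "function_definition": "Function",
    "function_declaration": "Function",
    "method_definition": "Method",
    "class_declaration": "Class",
    "class_definition": "Class",
    "import_statement": "Import",
    "call_expression": "CallSite",
}

# Hypothetical overrides for grammars whose node names differ.
# These entries are illustrative, not verified against every grammar version.
LANGUAGE_OVERRIDES = {
    "ruby": {"method": "Method", "class": "Class", "call": "CallSite"},
}


def mapping_for(lang):
    """Merge the universal table with any overrides for `lang`."""
    return {**UNIVERSAL_NODE_TYPES, **LANGUAGE_OVERRIDES.get(lang, {})}


def extract(root, lang):
    """Walk the tree depth-first and yield (concept, node) for mapped types."""
    mapping = mapping_for(lang)
    stack = [root]
    while stack:
        node = stack.pop()
        concept = mapping.get(node.type)
        if concept is not None:
            yield concept, node
        # Push children reversed so they are visited in source order.
        stack.extend(reversed(node.children))


@dataclass
class FakeNode:
    """Stand-in for a tree-sitter node, used only in this demo."""
    type: str
    children: list = field(default_factory=list)


tree = FakeNode("module", [
    FakeNode("import_statement"),
    FakeNode("class_definition", [
        FakeNode("block", [FakeNode("function_definition")]),
    ]),
])
concepts = [concept for concept, _ in extract(tree, "python")]
print(concepts)
```

Keeping overrides as plain dicts merged over the defaults means a new language needs no code at all unless its node names diverge.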

## Implementation Steps

1. Write `universal_parser.py` that loads any `tree-sitter-{lang}` grammar dynamically
2. Define the node type mapping table
3. Add language -> grammar package mapping (e.g. `go` -> `tree-sitter-go`)
4. Install grammars on-demand via npm/pip in `setup.sh`
5. Add language-specific override files for languages where node naming differs (Ruby, Haskell, etc.)
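Steps 1 and 3 could look roughly like the sketch below. It assumes the grammars are installed from PyPI as `tree-sitter-{lang}` packages that expose a `language()` function, and a recent py-tree-sitter (0.23 or newer is assumed) where `Language(ptr)` and `Parser(language)` are the constructors. The helper names and the `ENTRY_POINT_OVERRIDES` table are hypothetical; some packages bundle several grammars and use differently named entry points, so a small exception table is likely needed.

```python
import importlib

# Hypothetical exception table: (module name, entry-point function) for
# grammars that do not follow the plain `tree_sitter_<lang>.language()` shape.
ENTRY_POINT_OVERRIDES = {
    "typescript": ("tree_sitter_typescript", "language_typescript"),
    "tsx": ("tree_sitter_typescript", "language_tsx"),
}


def grammar_module_name(lang):
    """Map a language id such as 'c-sharp' to 'tree_sitter_c_sharp'."""
    return "tree_sitter_" + lang.lower().replace("-", "_")


def resolve_entry_point(lang):
    """Return (module name, function name) for the grammar of `lang`."""
    default = (grammar_module_name(lang), "language")
    return ENTRY_POINT_OVERRIDES.get(lang, default)


def load_language(lang):
    """Import the grammar package for `lang` and wrap it in a Language."""
    module_name, func_name = resolve_entry_point(lang)
    try:
        grammar = importlib.import_module(module_name)
    except ImportError as exc:
        pip_name = module_name.replace("_", "-")
        raise LookupError(
            f"No grammar installed for {lang!r}; try `pip install {pip_name}`"
        ) from exc
    # Imported lazily so the name-mapping helpers work without tree-sitter.
    from tree_sitter import Language
    return Language(getattr(grammar, func_name)())


def make_parser(lang):
    """Build a parser for `lang` (requires tree-sitter and the grammar)."""
    from tree_sitter import Parser
    return Parser(load_language(lang))
```

With `tree-sitter` and `tree-sitter-go` installed, `make_parser("go").parse(b"package main\n")` should return a tree whose `root_node` can be handed to the generic extraction layer. The `LookupError` message is also a natural hook for the on-demand install in step 4.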

## Custom Kernel Idea (long-term)

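For maximum performance, compile all grammars into a single shared library (a `.so` file) using py-tree-sitter's `Language.build_library()`. This should reduce per-process grammar loading overhead, and the result could be distributed as a wheel. One caveat: as far as I can tell, `Language.build_library()` was deprecated and then removed in newer py-tree-sitter releases (0.22 onward), so this would need either a pinned older version or a custom build step.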

This is essentially what codebase-memory-mcp does in C: it vendors all 158 grammars into one binary. The Python equivalent would be a compiled .so with all languages baked in.
