Skip to content

fix(cpp-extractor): capture default-value params, templated declarations, and nested-namespace method names#438

Open
tirth8205 wants to merge 2 commits into
Egonex-AI:mainfrom
tirth8205:fix/cpp-extractor-edge-cases
Open

fix(cpp-extractor): capture default-value params, templated declarations, and nested-namespace method names#438
tirth8205 wants to merge 2 commits into
Egonex-AI:mainfrom
tirth8205:fix/cpp-extractor-edge-cases

Conversation

@tirth8205

Copy link
Copy Markdown
Contributor

Problem

Three edge-case bugs caused the C++ tree-sitter extractor to silently drop real code constructs:

  1. Default-valued parameters dropped. extractParams only iterated parameter_declaration nodes. A parameter with a default value (e.g. int b = 10) is parsed by tree-sitter-cpp as an optional_parameter_declaration, so it was never captured. int add(int a, int b = 10) yielded params ["a"], and void f(int a = 1, int b = 2) yielded [].

  2. Templated functions and classes dropped. walkTopLevel had no template_declaration case. Any templated free function or templated class/struct is wrapped in a template_declaration node, so all of them vanished from functions/classes/exports. template <typename T> T identity(T x) {} produced no function; template <typename T> class Box { ... }; produced no class.

  3. Wrong method name/qualifier for nested-namespace definitions. extractFuncDeclName read only the top name field and the first namespace_identifier for a qualified_identifier. For a deeply scoped definition like void net::Server::start() the qualified_identifiers nest, so the name became Server::start and the qualifier became net — polluting the functions list and registering the method under the wrong class.

Fix

packages/core/src/plugins/extractors/cpp-extractor.ts:

  1. extractParams now iterates all params in source order, handling both parameter_declaration and optional_parameter_declaration.
  2. Added a template_declaration case to walkTopLevel that dispatches the inner function_definition / class_specifier / struct_specifier through the existing handlers.
  3. extractFuncDeclName now walks down the name field of nested qualified_identifiers to the leaf identifier, tracking the immediately-enclosing scope as the qualifier — yielding name start / qualifier Server for void net::Server::start().

(Also dropped the now-unused findChildren import.)

Testing

Added five edge-case tests in cpp-extractor.test.ts (default-value param, all-default params, templated free function, templated class, nested-namespace out-of-class method). All five fail before the fix and pass after. The full core suite (697 tests) is green, and tsc --noEmit + eslint on the changed files both pass clean.

🤖 Generated with Claude Code

…ons, and nested-namespace method names

Three edge-case bugs in the C++ tree-sitter extractor:

1. extractParams only iterated parameter_declaration nodes, so any
   parameter with a default value (parsed as optional_parameter_declaration)
   was silently dropped. Now both node types are handled in source order.

2. walkTopLevel had no template_declaration case, so all templated free
   functions and templated classes/structs were dropped. Added a case that
   dispatches the inner declaration through the existing handlers.

3. extractFuncDeclName mis-parsed nested qualified_identifiers: for
   void net::Server::start() it produced name 'Server::start' and qualifier
   'net'. Now walks down the name field to the leaf identifier, tracking the
   immediately-enclosing scope, yielding name 'start' / qualifier 'Server'.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

@thejesh23 thejesh23 left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few concerns before this lands. (Surfacing because the same LanguageExtractor pattern is in flight for Dart in #435 — the optional-param / generics-in-qualifier shape repeats there, so worth aligning.)

1. Call-graph caller name is a silent behavior change. extractCallGraphextractFunctionNameextractFuncDeclName. For void net::Server::start() { foo(); } the caller string was "Server::start" and is now "start". Anything keying call-graph edges on the prior form (dashboard nodes, the linker, snapshot tests in other packages) will see edges rename. Please grep before merge, or call this out as intentional.

2. The "templated declarations" fix is partial. walkTopLevel's new template_declaration arm only handles one nesting level + three inner shapes. These still drop silently:

  • Nested template_declaration (out-of-class member templates: template<class T> template<class U> void Box<T>::set(U)).
  • In-class member templates — extractClassOrStruct has no template_declaration arm at all, so class Foo { template<class T> void bar(T); }; loses bar.

At minimum, a TODO comment so the next reader doesn't think these are covered.

3. Templated qualifier mismatch. For void Box<int>::get() the scope text is "Box<int>", so methodsByClass.get("Box<int>") misses the class registered as "Box". Pre-existing, but the nested-walk loop makes it more reachable. A qualifier.replace(/<.*>$/, "") would fix the common case; otherwise file a follow-up.

4. Test gaps.

  • The net::Server::start test is effectively two-level — a regression that only descends one level would still pass it. A three-level case (void a::b::Server::foo()) would lock the loop.
  • The fixture puts class Server at file scope and then defines net::Server::start — syntactically legal but doesn't exercise namespace recursion + qualifier resolution together. A namespace net { class Server {...}; } void net::Server::start(){} fixture would.
  • No direct assertion on qualifier; the current "no function named Server::start" is a negative existence check.

5. Variadic params. variadic_parameter_declaration (e.g. void f(Ts... xs)) isn't in the new accept-list. Probably fine if unwrapDeclaratorName reaches the identifier through the pack-expansion declarator, but worth a one-line test to confirm — and same question applies to my Dart PR for [positional] / {named: default} shapes once cpp lands.

Nit: stale JSDoc on extractFuncDeclName (lines 28–38) — still describes the simple case only.

…plated qualifiers

Addresses review on PR Egonex-AI#438:
- Variadic parameter packs (`Ts... xs`) are now captured via
  variadic_parameter_declaration in extractParams.
- Templated qualifiers (`Box<T>::get`) strip the template_type arguments so
  the method resolves against the bare class registered as "Box".
- In-class member templates (`class Foo { template<class T> void bar(T); };`)
  are captured (both declaration-only and inline) via a template_declaration
  arm in the class member loop.
- Out-of-class member templates with nested template_declaration
  (`template<class T> template<class U> void Box<T>::set(U)`) are captured by
  recursing through nested template_declaration nodes.
- Documented the intentional leaf-name choice for qualified call-graph caller
  names and refreshed the extractFuncDeclName JSDoc.

Adds tests locking each behavior plus a combined namespace+qualifier fixture,
a three-segment qualifier walk, and a positive class-method association
assertion.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@tirth8205

Copy link
Copy Markdown
Contributor Author

All five points plus the nit are addressed in 63cba76. Rather than stub TODOs, I verified the real tree-sitter-cpp node shapes with a throwaway probe test, then implemented against the observed types.

1 — Call-graph caller name. Confirmed this only changes names with >=3 segments: for single-level Server::start the old name-field text was already start. The change is internally consistent — extractStructure's function-node name and extractCallGraph's caller both go through the same extractFuncDeclName, so both ends of a calls edge moved together. Grepped core for snapshot fixtures / cross-package call-graph consumers and found none. Kept the leaf-name behavior and added a comment documenting it as intentional, plus refreshed the extractFuncDeclName JSDoc.

2 — Member templates. Implemented, not deferred. In-class member templates (class Foo { template<class T> void bar(T); };) now have a template_declaration arm in the class member loop handling both declaration-only and inline-body forms. Out-of-class member templates with nested template_declaration (template<class T> template<class U> void Box<T>::set(U)) are handled by recursing through nested template_declaration nodes in walkTopLevel. Tests added for both.

3 — Templated qualifier. Fixed. The scope field of Box<int>::get is a template_type node, so I read its type_identifier child (Box) instead of the full text. This avoids the greedy /<.*>$/ regex (which mishandles nested angle brackets) and resolves the method against the class registered as Box. Test added.

4 — Test gaps. Added a namespace net { class Server {...}; } void net::Server::start(){} fixture that exercises namespace recursion + qualifier resolution together with a positive assertion that the Server class gains method start, a three-segment a::b::Server::foo case to lock the walk, and explicit negative checks that no qualified name leaks into functions.

5 — Variadic params. Confirmed variadic_parameter_declaration was being skipped by the accept-list guard. Added it; the name flows through variadic_declarator -> identifier via the existing unwrapDeclaratorName fallback. Test added asserting void f(int a, Ts... xs) yields ["a", "xs"].

Nit — JSDoc. Updated to describe the deep-nesting leaf/immediate-scope behavior and the templated-qualifier stripping.

Full core suite green (703 passed); tsc --noEmit clean.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants