
Protocol-aware cross-repo intelligence: gRPC + typed-message + attribute-route matching (design proposal) #292

@sponger94


Status (2026-04-26): Tier 1a (consumer-side gRPC IDL Routes + HANDLES binding) is open as #293 against this issue. Tier 1b (producer-side typed-client GRPC_CALLS emission) and Tiers 2–4 will follow as separate PRs sequenced per §5 below.


Protocol-Aware Cross-Repo Intelligence

Status: Design proposal for upstream codebase-memory-mcp
Audience: cbm maintainer + reviewers
Scope: Extends pass_cross_repo from literal-string matching to protocol-aware matching across the four cross-language patterns that account for >95% of inter-service communication in modern codebases.
Compatibility: Strictly additive. No breaking changes to existing tools, edges, or APIs. Builds on PR #281 (rich get_architecture fields).


TL;DR

cbm already has the scaffolding for cross-repo intelligence (pass_cross_repo.c, CROSS_HTTP_CALLS / CROSS_ASYNC_CALLS / CROSS_GRPC_CALLS edge types, named-route matching). The current implementation only fires when a call site has a literal URL or topic string as its first argument. That covers idiomatic Python/Node code well, but misses the dominant pattern in modern strongly-typed stacks (Java/Spring, .NET, Kotlin, Go-with-codegen): typed clients and message handlers where the routing identifier is a generic type parameter, an interface ancestor, an attribute, or a config-resolved name — never a literal string at the call site.

This proposal adds four protocol-aware extraction tiers, each language-generic, behind a YAML-driven service-pattern registry so adding new frameworks is a config edit rather than a C patch. Tier 1 (gRPC .proto matching) is proposed first as a ~300-LOC working PR to validate the architecture; Tiers 2–4 follow as separate PRs.

Cross-language framework coverage matrix at the end. Acceptance gating: each tier ships independently; success is measured by precision/recall against multi-language test fixtures.


1. Background

1.1 What works in cbm today

After PR #281 lands, get_architecture(aspects=["all"]) returns rich structural data (entry_points, routes, hotspots, layers, boundaries, languages). The per-repo extraction pipeline detects:

  • Library identifiers in resolved qualified names (service_patterns.c:631 — 252 patterns across HTTP, async, gRPC, config, route-registration kinds, covering Python/Node/Go/Java/Rust/PHP/Ruby/C# basics)
  • Literal URL / topic strings at call sites (pass_calls.c:emit_http_async_edge)
  • Route registration via app.get("/x", ...) and attribute-routed framework styles (pass_route_nodes.c)
  • Cross-repo matching when both sides have a literal-route identifier (pass_cross_repo.c:cbm_cross_repo_match)

The cross-repo-intelligence mode in index_repository matches __route__<METHOD>__<path> keys across project DBs and emits CROSS_HTTP_CALLS / CROSS_ASYNC_CALLS / CROSS_GRPC_CALLS edges.

1.2 What doesn't work

emit_http_async_edge (pass_calls.c, line ~232):

bool is_url = (url_or_topic && url_or_topic[0] != '\0' &&
               (url_or_topic[0] == '/' || strstr(url_or_topic, "://") != NULL));
bool is_topic = (url_or_topic && ... && svc == CBM_SVC_ASYNC && ...);
if (!is_url && !is_topic) {
    /* fall back to plain CALLS edge */
    return;
}

If the first string argument is not a literal URL or topic, the edge falls through to CALLS — generic, unrouted, untaggable for cross-repo matching. Idiomatic code in major modern frameworks rarely passes a literal URL or topic at the call site:

// .NET / MassTransit — no topic string, message type is the identifier
await _publishEndpoint.Publish(new VoucherRedeemed(voucher.Id), ct);

// .NET / generated gRPC client — no URL string, service.method is the identifier
var resp = await _promoCodeClient.GetVoucherAsync(req, cancellationToken: ct);

// Java / Spring Cloud Stream — no topic string, message type is the identifier
streamBridge.send("output", new OrderShipped(order));

// Java / Feign — interface annotation IS the route, no literal at call site
return feignClient.getVoucher(id);

// Kotlin / Retrofit — same shape as Feign
return retrofitApi.getVoucher(id)

// Go with gRPC codegen — generated client method, no string at call site
resp, err := promoClient.GetVoucher(ctx, req)

// Python / FastAPI typed httpx client (oapi-codegen-derived) — same shape
resp = await client.get_voucher(id=id)

In each case, the producer-side identifier (message type FQN, gRPC service.method, Feign interface annotation) is statically present and resolvable — but not as a literal string argument. It lives in: a generic type parameter, the constructor type of an argument, a class-level attribute, or a method-level attribute.

The consumer side has the same identifier visible in a different syntactic position: an IConsumer<T> declaration, a *Base implementation, a @StreamListener<T> annotation, an attribute-routed controller method.

The matching problem is solvable. The producer/consumer identifier exists statically on both sides. cbm's current extractor just doesn't extract it.

1.3 Why this matters now

Cross-repo intelligence is one of the most-asked-for capabilities in code-graph tools. Industry tooling that does parts of this:

| Tool | Approach | Limitation |
| --- | --- | --- |
| Backstage (Spotify) | Service catalog from OpenAPI / AsyncAPI / .proto files | Manual catalog maintenance; declarative, not derived |
| Sourcegraph | Cross-repo references via SCIP indexes | Per-symbol; name-matches rather than protocol-matches |
| Apollo Studio | Federated GraphQL via @key directives | GraphQL only |
| AsyncAPI tooling | Typed async-message matching | AsyncAPI-spec only; requires explicit AsyncAPI files |
| GitHub CodeQL | Cross-repo dataflow for security | Security-focused, heavyweight |
| stack-graphs (GitHub) | Universal name resolution graph | Within-repo only |
cbm is uniquely positioned: single binary, AST + LSP-grade extraction, sub-second incremental indexing, no external service dependencies. The cross-repo capability matters because it's the missing 20% of value that turns "smart code search" into "service-architecture truth source."


2. Cross-language pattern audit

The producer→consumer routing problem decomposes into four actionable tiers (a fifth, reflection-driven tier is named below for completeness but is out of scope). Each tier is generic across major language ecosystems. Concrete framework instances per tier:

Tier 1 — IDL-driven typed stubs (gRPC, GraphQL, OpenAPI, AsyncAPI)

The stable identifier lives in an IDL file shared between producer and consumer repos. Both sides reference generated types derived from the same IDL.

| Ecosystem | Producer pattern | Consumer pattern | Stable identifier |
| --- | --- | --- | --- |
| gRPC (Go, Java, Python, Rust, TS, C#, Kotlin, Swift) | *Client from .proto codegen | *Base / *Servicer impl from .proto codegen | service.method from .proto |
| GraphQL Federation (any GraphQL stack) | typed query/mutation client | resolver bound to type with @key directive | type + key from .graphql |
| OpenAPI (NSwag / openapi-generator / oapi-codegen) | generated typed client per language | controller/handler matching path + method | path + method from openapi.yaml |
| AsyncAPI | generated publisher | generated subscriber | channel + message from asyncapi.yaml |

Detection: parse the IDL file and extract canonical IDs as routes; on the producer side, find references to generated client types; on the consumer side, find generated base-class implementations.

Genericity: 100%. gRPC alone covers 8+ languages. .proto/.graphql/.openapi files are language-agnostic by design.

Tier 2 — Typed message pub/sub (interface-ancestor + generic-type)

The stable identifier is a message type's fully-qualified name. Producer has Publish<T> / Send<T> / equivalent on a known interface; consumer implements IConsumer<T> / @MessageHandler / equivalent.

| Language | Frameworks | Producer shape | Consumer shape |
| --- | --- | --- | --- |
| C# | MassTransit, NServiceBus, Wolverine, Brighter, Rebus | IPublishEndpoint.Publish<T> / ISendEndpoint.Send<T> / IBus.Publish<T> | IConsumer<T> / IHandleMessages<T> |
| Java/Kotlin | Spring Cloud Stream, Axon, Eventuate | streamBridge.send(...) | @CommandHandler / @StreamListener<T> / @EventHandler |
| Node/TS | NestJS microservices, Moleculer, EventBus libs | @MessagePattern<T> emit | @EventPattern<T> handler |
| Python | Faust, Celery (typed), aio-pika typed wrappers | @app.agent send | typed handler funcs |
| Go | Watermill, NATS-typed, Wire | typed publish via marshalers | typed subscriber registration |
| Rust | Lapin + serde, async-nats with typed deserialization | typed publish | typed subscribe |

Detection: pattern-match the producer interface (e.g., IPublishEndpoint, streamBridge) and its Publish<T> / Send<T> method, extracting T from the generic parameter or the constructor argument's type. On the consumer side, find classes implementing IConsumer<T> / IHandleMessages<T>, or classes with @StreamListener<T> on a method, and extract T. Match by FQN.
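To ground the extraction step, a minimal self-contained C sketch, assuming the call site has already been resolved to a qualified name that carries the generic argument; the helper and its input format are hypothetical, not existing cbm code:

/* Hypothetical sketch: pull the message FQN out of a resolved call QN
 * such as "IPublishEndpoint.Publish<Contracts.VoucherRedeemed>".
 * When there is no generic argument, the extractor would fall back to
 * the constructor type of the first argument (Publish(new T(...))). */
#include <stddef.h>
#include <stdio.h>
#include <string.h>

static int extract_generic_arg(const char *call_qn, char *out, size_t out_len) {
    const char *open = strchr(call_qn, '<');
    const char *close = open ? strrchr(call_qn, '>') : NULL;
    if (!open || !close || close <= open + 1) return 0;
    size_t n = (size_t)(close - open - 1);
    if (n >= out_len) n = out_len - 1;
    memcpy(out, open + 1, n);
    out[n] = '\0';
    return 1;
}

int main(void) {
    char fqn[256];
    if (extract_generic_arg("IPublishEndpoint.Publish<Contracts.VoucherRedeemed>",
                            fqn, sizeof fqn))
        printf("message_fqn = %s\n", fqn); /* Contracts.VoucherRedeemed */
    return 0;
}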

Genericity: highly cross-language. ~6 framework families, identical abstract pattern.

Tier 3 — Attribute / decorator-driven HTTP routes

Producer is a typed HTTP client whose interface methods carry route attributes; consumer is a controller/handler with matching route attributes. Both attribute values are literal strings — the easiest tier if extracted from the attribute, not the call site.

| Language | Producer | Consumer |
| --- | --- | --- |
| C# | Refit, RestEase ([Get("/x")] on interface) | ASP.NET Core ([HttpGet("/x")] controller) |
| Java | Feign (@RequestLine("GET /x")), Retrofit (@GET("/x")) | Spring (@GetMapping("/x")), JAX-RS (@GET @Path("/x")) |
| Kotlin | Retrofit | Spring, Ktor route DSL |
| TypeScript | tsoa, NestJS HttpService with openapi-derived clients | NestJS (@Get("/x")), Hono, Express decorators |
| Python | httpx-codegen, aiohttp wrappers from openapi | FastAPI (@app.get("/x")), Litestar |
| Go | huma generated, oapi-codegen clients | huma, chi, gin, echo route registration |
| Rust | utoipa generated | actix-web, axum, rocket route attributes |

Detection: extract HTTP method + path from class-level / method-level attributes on both interfaces (producer) and concrete classes (consumer). Match.
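One detail worth showing: producer and consumer attributes may name path parameters differently, so the match should compare normalized templates. A self-contained C sketch (helper name invented):

/* Hypothetical sketch: normalize a route template so producer and
 * consumer attribute values compare equal even when path-parameter
 * names differ ("/vouchers/{id}" vs "/vouchers/{voucherId}"). */
#include <stdio.h>
#include <string.h>

static void normalize_route(const char *in, char *out, size_t out_len) {
    size_t j = 0;
    for (size_t i = 0; in[i] != '\0' && j + 3 < out_len; i++) {
        if (in[i] == '{') {                 /* collapse "{name}" to "{}" */
            while (in[i] != '\0' && in[i] != '}') i++;
            out[j++] = '{';
            out[j++] = '}';
        } else {
            out[j++] = in[i];
        }
    }
    out[j] = '\0';
}

int main(void) {
    char a[128], b[128];
    normalize_route("/vouchers/{id}", a, sizeof a);
    normalize_route("/vouchers/{voucherId}", b, sizeof b);
    printf("%s == %s -> %d\n", a, b, strcmp(a, b) == 0); /* both "/vouchers/{}" */
    return 0;
}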

Genericity: most universal — decorator-driven HTTP routing is the modern default in every serious web ecosystem.

Tier 4 — Config-resolved service discovery

The producer's call site has only a relative path or named-client reference; the actual base URL lives in a config file (appsettings.json, application.yaml, env vars, Kubernetes Service DNS, service-registry config). Consumer side uses Tier 3 attribute-driven detection.

| Ecosystem | Producer | Config source |
| --- | --- | --- |
| C# | IHttpClientFactory.CreateClient("name") | appsettings*.json, services.AddHttpClient(...) |
| Spring | @FeignClient(name="x", url="${promo.url}") | application.yaml, env |
| Spring Cloud / Eureka / Consul | service registry lookups | registry config |
| Kubernetes | Service DNS (http://promocode-service:80/x) | Service / Ingress YAML |
| Node | env-driven base URLs in axios/fetch wrappers | .env, Helm values |
| Go | viper-loaded named services | YAML / env |

Detection: scan config files for named-service → base-URL mappings; trace CreateClient("name") / @FeignClient("name") to the resolved URL; combine with the variable URL path within the calling method to reconstruct the full route.

Genericity: universal microservice pattern.

Tier 5 — Reflection / runtime-resolved DI (out of scope)

_serviceProvider.GetService(Type.GetType(configString))?.Invoke(...) is genuinely impossible to resolve statically. This tier is named for completeness but explicitly out of scope. Estimated <5% of cross-service calls in practice.


3. Proposed architecture

3.1 Plugin-based service-pattern registry

internal/cbm/service_patterns.c currently hardcodes 252 patterns in a C array. Adding a new framework requires a C patch + recompile. Proposal: externalize the pattern table to a YAML / JSON registry loaded at startup.

Format example (registry-format-1.yaml — actual schema TBD with maintainer):

patterns:
  # Tier 2 — typed-message pub/sub
  - id: masstransit-publish
    languages: [csharp]
    kind: ASYNC_CALLS
    producer:
      match:
        type_implements: IPublishEndpoint
        method_pattern: "Publish<T>(...)"
      extract_id_from: generic_type_arg_or_first_arg_type
      id_kind: message_fqn
      broker: rabbitmq
    consumer:
      match:
        class_implements: "IConsumer<T>"
      extract_id_from: generic_type_arg
      id_kind: message_fqn

  - id: spring-cloud-stream-handler
    languages: [java, kotlin]
    kind: ASYNC_CALLS
    producer:
      match:
        method_calls: "streamBridge.send"
        first_arg_kind: string_literal
      extract_id_from: first_arg
      id_kind: channel_name
    consumer:
      match:
        method_annotation: "@StreamListener"
      extract_id_from: annotation_value
      id_kind: channel_name

  - id: refit-client
    languages: [csharp]
    kind: HTTP_CALLS
    producer:
      match:
        interface_method_attribute: "[Get|Post|Put|Delete|Patch]"
      extract_id_from: attribute_arg
      id_kind: http_route
    consumer:
      match:
        method_attribute: "[HttpGet|HttpPost|HttpPut|HttpDelete|HttpPatch]"
      extract_id_from: attribute_arg
      id_kind: http_route

Benefits:

  • Adding Wolverine, Watermill, or any new framework is one YAML entry, not a code patch + release cycle
  • Maintainer review surface drops dramatically (review YAML, not C)
  • Community contributions become low-risk (a YAML PR can't crash the binary)
  • Multi-language patterns compose naturally (one ID matches both Java and Kotlin via languages: [java, kotlin])

Existing 252 patterns in service_patterns.c can be migrated to YAML in a separate cleanup PR (no behavior change, pure refactor) — out of scope for this proposal but a natural follow-on.
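For a sense of the C-side surface area, a sketch of the in-memory shape one loaded registry entry might take; every type and field name here is hypothetical, pending the schema discussion above:

/* Hypothetical in-memory form of one YAML registry entry. The enums
 * mirror the closed vocabularies the schema sketch above uses for
 * extract_id_from and id_kind; none of this is existing cbm code. */
#include <stdio.h>

typedef enum {
    EXTRACT_GENERIC_TYPE_ARG,
    EXTRACT_FIRST_ARG,
    EXTRACT_FIRST_ARG_TYPE,
    EXTRACT_ANNOTATION_VALUE,
    EXTRACT_ATTRIBUTE_ARG,
} cbm_extract_strategy_t;

typedef enum { ID_MESSAGE_FQN, ID_CHANNEL_NAME, ID_HTTP_ROUTE } cbm_id_kind_t;

typedef struct {
    const char *selector;                /* e.g. the type_implements value */
    cbm_extract_strategy_t extract_from; /* where the identifier lives     */
    cbm_id_kind_t id_kind;               /* namespace used for matching    */
} cbm_pattern_side_t;

typedef struct {
    const char *id;           /* "masstransit-publish"                     */
    const char *languages[4]; /* NULL-terminated, e.g. {"csharp", NULL}    */
    const char *edge_kind;    /* "ASYNC_CALLS" / "HTTP_CALLS"              */
    cbm_pattern_side_t producer;
    cbm_pattern_side_t consumer;
} cbm_service_pattern_t;

static const cbm_service_pattern_t kExample = {
    .id = "masstransit-publish",
    .languages = { "csharp", NULL },
    .edge_kind = "ASYNC_CALLS",
    .producer = { "IPublishEndpoint", EXTRACT_GENERIC_TYPE_ARG, ID_MESSAGE_FQN },
    .consumer = { "IConsumer<T>", EXTRACT_GENERIC_TYPE_ARG, ID_MESSAGE_FQN },
};

int main(void) { printf("loaded pattern: %s\n", kExample.id); return 0; }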

3.2 Pipeline integration

Three changes to the existing pipeline:

  1. New pass: pass_idl_scan — runs once per repo before pass_definitions. Scans for IDL files (.proto, .graphql, openapi.yaml, asyncapi.yaml) and emits canonical Route nodes derived from them. Each Route gets a stable QN like __idl_route__grpc__promocode.PromoCodeManagerGrpcService/GetVoucher regardless of which language consumes it (see the QN sketch after this list).

  2. Extend pass_calls.c emit_classified_edge — when matching against the new YAML-driven patterns, support extracting identifiers from:

    • Generic type parameters (Publish<T>)
    • Constructor argument types (Publish(new T(...)))
    • Interface-method attributes ([Get("/x")])
    • Class-level attributes (@FeignClient(name="x"))
    • Combined with config-resolved values (Tier 4)
  3. Auto-trigger cross-repo pass for workspace siblings — when a repo is part of a workspace (e.g., cross-repo-intelligence mode is invoked once with target_projects: ["*"]), persist the workspace membership in the artifact, and on subsequent re-indexes auto-fire cross-repo matching against the same sibling set.
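The stable-QN construction from change 1, as a minimal runnable C fragment (helper name hypothetical):

/* Hypothetical sketch: build the language-agnostic Route QN that
 * pass_idl_scan would emit for one gRPC method, so that producer and
 * consumer repos in any language key against the same string. */
#include <stdio.h>

static int idl_route_qn(char *buf, size_t len,
                        const char *service, const char *method) {
    return snprintf(buf, len, "__idl_route__grpc__%s/%s", service, method);
}

int main(void) {
    char qn[256];
    idl_route_qn(qn, sizeof qn,
                 "promocode.PromoCodeManagerGrpcService", "GetVoucher");
    printf("%s\n", qn);
    /* __idl_route__grpc__promocode.PromoCodeManagerGrpcService/GetVoucher */
    return 0;
}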

3.3 Cross-repo extension

The existing cbm_cross_repo_match already supports topic-based matching. Two extensions:

  1. Add match_by_message_fqn — phase D (after HTTP / Async / Channel matching). For each ASYNC_CALLS edge with message_fqn property, find consumer-side IConsumer<message_fqn> registrations in target DBs and emit CROSS_ASYNC_CALLS edges.

  2. Add match_by_grpc_method — phase E. For each gRPC client call with service.method identifier, find consumer-side *Base overrides of the same service.method and emit CROSS_GRPC_CALLS edges. Reuses the existing CROSS_GRPC_CALLS edge type and emission helper at pass_cross_repo.c:657.

Both extensions reuse the existing route-matching scaffolding (emit_cross_route_bidirectional). Pure additive code paths.
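A self-contained toy of the Phase D matching idea; the structs and data are invented for illustration, and the real implementation would go through the existing route-matching scaffolding rather than raw arrays:

/* Toy of match_by_message_fqn: match producer-side ASYNC_CALLS edges
 * carrying a message_fqn against consumer-side IConsumer<T>
 * registrations found in a sibling repo's DB. */
#include <stdio.h>
#include <string.h>

typedef struct { const char *from_fn; const char *message_fqn; } producer_edge;
typedef struct { const char *consumer_class; const char *message_fqn; } consumer_reg;

int main(void) {
    producer_edge produced[] = {
        { "VoucherService.Redeem", "Contracts.VoucherRedeemed" },
    };
    consumer_reg consumed[] = {   /* from the target project's DB */
        { "VoucherRedeemedConsumer", "Contracts.VoucherRedeemed" },
        { "OrderShippedConsumer",    "Contracts.OrderShipped" },
    };
    for (size_t i = 0; i < sizeof produced / sizeof *produced; i++)
        for (size_t j = 0; j < sizeof consumed / sizeof *consumed; j++)
            if (strcmp(produced[i].message_fqn, consumed[j].message_fqn) == 0)
                printf("CROSS_ASYNC_CALLS %s -> %s (message_fqn=%s)\n",
                       produced[i].from_fn, consumed[j].consumer_class,
                       produced[i].message_fqn);
    return 0;
}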


4. Tier 1 detailed spec — gRPC .proto matching

Proposed as the first PR to validate the architecture. Smallest scope, highest universality (8+ languages), zero framework variance (.proto syntax is standardized by Google).

4.1 Producer-side extraction

Detect references to generated gRPC client types. The detection signal is the type name pattern, not call-site strings:

  • C#: classes/interfaces ending in Client derived from Grpc.Core.ClientBase<T> (generated by Grpc.Tools)
  • Go: structs with *grpc.ClientConn field + methods matching .proto service methods
  • Python: classes from *_pb2_grpc.py ending in Stub
  • Java/Kotlin: classes ending in *Grpc.*Stub (generated by protoc-gen-grpc-java)
  • TypeScript: classes from *_pb_grpc.d.ts with the right shape
  • Rust: tonic-generated *Client structs

For each method call on a generated client type:

  1. Resolve the client type to its underlying service.method pair (recoverable from .proto — see §4.3)
  2. Emit a CALLS edge with new properties: {rpc_kind: "grpc", service: "promocode.PromoCodeManagerGrpcService", method: "GetVoucher"}
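A minimal sketch of formatting the properties from step 2; the helper is hypothetical, and only the property names come from this spec:

/* Hypothetical sketch: format the new edge properties for a typed
 * gRPC client call. The emission API itself is not shown. */
#include <stdio.h>

static int format_grpc_props(char *buf, size_t len,
                             const char *service, const char *method) {
    return snprintf(buf, len,
                    "{\"rpc_kind\":\"grpc\",\"service\":\"%s\",\"method\":\"%s\"}",
                    service, method);
}

int main(void) {
    char props[512];
    format_grpc_props(props, sizeof props,
                      "promocode.PromoCodeManagerGrpcService", "GetVoucher");
    puts(props);
    return 0;
}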

4.2 Consumer-side extraction

Detect classes implementing the generated gRPC server-base type:

  • C#: : PromoCodeManagerGrpcServiceBase
  • Go: structs with method receivers matching the unimplemented server interface
  • Python: classes inheriting *Servicer
  • Java: extends *ImplBase
  • Rust: impl *Server for ...

For each override of a service method, emit a Route node with QN __idl_route__grpc__<service>/<method> and a HANDLES edge from the implementing class.
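As a concrete illustration of the detection signal, a self-contained C sketch; a suffix check alone is only a heuristic, and the real pass would also verify that the base type originates from generated code:

/* Hypothetical sketch of the consumer-side signal: does a base-type
 * name look like a generated gRPC server base? Suffixes follow the
 * per-language list above. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

static bool has_suffix(const char *s, const char *suffix) {
    size_t ls = strlen(s), lx = strlen(suffix);
    return ls >= lx && strcmp(s + ls - lx, suffix) == 0;
}

static bool looks_like_grpc_server_base(const char *base_type) {
    return has_suffix(base_type, "ServiceBase")   /* C#            */
        || has_suffix(base_type, "Servicer")      /* Python        */
        || has_suffix(base_type, "ImplBase")      /* Java / Kotlin */
        || has_suffix(base_type, "Server");       /* Rust / tonic  */
}

int main(void) {
    printf("%d\n", looks_like_grpc_server_base("PromoCodeManagerGrpcServiceBase")); /* 1 */
    printf("%d\n", looks_like_grpc_server_base("ControllerBase"));                  /* 0 */
    return 0;
}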

4.3 IDL parsing

pass_idl_scan reads .proto files (anywhere in the repo by default; configurable) and builds the canonical service.method → package.Service.method mapping. Tree-sitter has a maintained tree-sitter-proto grammar. ~150 LOC for the parser + AST walk + node emission.
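A runnable sketch of what that walk could look like; the node-type names and child layout are assumptions about the tree-sitter-proto grammar and would need checking against its real node types:

/* Hypothetical sketch of the pass_idl_scan walk over a .proto AST.
 * "service" / "rpc" node types and identifier-as-first-named-child
 * are assumed; the real pass would also prefix the proto package
 * (promocode.PromoCodeManagerGrpcService) when building the QN. */
#include <stdio.h>
#include <string.h>
#include <tree_sitter/api.h>

const TSLanguage *tree_sitter_proto(void); /* provided by the grammar */

static void walk(TSNode node, const char *src, const char *service) {
    const char *type = ts_node_type(node);
    char name[256] = "";
    if (strcmp(type, "service") == 0 || strcmp(type, "rpc") == 0) {
        TSNode id = ts_node_named_child(node, 0);
        uint32_t a = ts_node_start_byte(id), b = ts_node_end_byte(id);
        snprintf(name, sizeof name, "%.*s", (int)(b - a), src + a);
    }
    if (strcmp(type, "rpc") == 0 && service != NULL) {
        printf("__idl_route__grpc__%s/%s\n", service, name); /* emit Route QN */
        return;
    }
    const char *inner = (strcmp(type, "service") == 0) ? name : service;
    for (uint32_t i = 0; i < ts_node_named_child_count(node); i++)
        walk(ts_node_named_child(node, i), src, inner);
}

int main(void) {
    const char *src =
        "syntax = \"proto3\";\n"
        "package promocode;\n"
        "service PromoCodeManagerGrpcService {\n"
        "  rpc GetVoucher (VoucherRequest) returns (VoucherReply);\n"
        "}\n";
    TSParser *parser = ts_parser_new();
    ts_parser_set_language(parser, tree_sitter_proto());
    TSTree *tree = ts_parser_parse_string(parser, NULL, src,
                                          (uint32_t)strlen(src));
    walk(ts_tree_root_node(tree), src, NULL);
    ts_tree_delete(tree);
    ts_parser_delete(parser);
    return 0;
}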

4.4 Cross-repo matching

Add match_grpc_calls as Phase E in cbm_cross_repo_match:

/* For each producer-side CALLS edge with rpc_kind=grpc:
 *   1. Look up service+method in target project's IDL-derived Routes
 *   2. If found, find the HANDLES edge from a *Base class
 *   3. Emit CROSS_GRPC_CALLS bidirectional edge
 */

Reuses emit_cross_route_bidirectional and the existing CROSS_GRPC_CALLS edge type. ~100 LOC.

4.5 Estimated diff size

| Component | Files | LOC |
| --- | --- | --- |
| pass_idl_scan.c (new) | 1 | ~150 |
| service_patterns.c (gRPC client/server type recognizers) | 1 | ~50 |
| pass_calls.c extension for typed-client RPC properties | 1 | ~60 |
| pass_cross_repo.c Phase E | 1 | ~100 |
| Tests + fixtures (multi-language) | several | ~200 |
| Total | | ~560 |

Larger than my earlier 300-LOC estimate — that didn't include tests. Production-grade with tests is ~560.

4.6 Test fixtures (multi-language)

Three tiny service repos plus a shared contracts directory, shipped under testdata/cross-repo/grpc/:

  • service-a-csharp/ — minimal .NET project consuming service-b's gRPC client
  • service-b-go/ — minimal Go gRPC server implementing the .proto from contracts/
  • service-c-python/ — minimal Python consumer of service-b's gRPC service
  • contracts/ — single .proto file shared by all three

Test asserts: after indexing all four (contracts/ first, then services), cbm_cross_repo_match emits CROSS_GRPC_CALLS edges from a-csharp and c-python to b-go's handler classes, with correct service+method properties.

4.7 Success criteria

| Metric | Target |
| --- | --- |
| Precision on test fixtures | 100% (deterministic — gRPC has no naming ambiguity) |
| Recall on test fixtures | 100% — all known cross-service calls detected |
| Index-time overhead (repos with .proto files) | <5% additional time per repo |
| Index-time overhead (repos without .proto files) | 0% (pass_idl_scan is a no-op) |
| Memory overhead | Proportional to .proto count, ~1KB per service definition |
| Backwards compatibility | All existing tests pass; existing CROSS_* edges unchanged |

5. Roadmap — Tiers 2–4

Each tier is a separate PR after Tier 1 lands. Sequence chosen by descending universality and ascending implementation complexity.

5.1 Tier 2 — typed message pub/sub (after Tier 1)

Scope: introduce the YAML-driven service-pattern registry; ship initial registry covering MassTransit (C#), Spring Cloud Stream (Java/Kotlin), and NestJS (TS) as proof of multi-language genericity. Add pass_message_synthesis that emits ASYNC_CALLS edges keyed by message_fqn instead of requiring a topic literal. Extend pass_cross_repo Phase D to match by message_fqn.

Estimated LOC: ~800 (registry loader, YAML schema, three framework definitions, new pass, cross-repo extension, tests).

Risk: brittleness on framework-version drift (MassTransit v8 vs v7 have slightly different interface shapes). Mitigation: registry entries can be version-tagged; pattern matching tolerates shape variance.

5.2 Tier 3 — attribute-driven routes (after Tier 2)

Scope: extend pass_route_nodes.c to extract routes from interface-method attributes (Refit / Retrofit / Feign) on the producer side. Match against existing controller-side attribute extraction. Most attribute-driven controller patterns are already detected by cbm — this tier closes the producer-side gap.

Estimated LOC: ~400.

Risk: low. Attribute syntax is declarative and stable across framework versions.

5.3 Tier 4 — config-resolved service discovery (after Tier 3)

Scope: extend pass_envscan to also parse appsettings.json, application.yaml, helm values, kustomize overlays. Build named-client → base-URL maps. Add light intra-method dataflow to resolve path = $"/api/{x}" patterns. Combine with named-client resolution to reconstruct full URLs.
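A toy of the final reconstruction step, assuming the named-client map has already been parsed out of config (all data invented):

/* Self-contained toy of Tier 4 route reconstruction: join the
 * config-resolved base URL for a named client with the relative path
 * observed at the call site. */
#include <stdio.h>
#include <string.h>

typedef struct { const char *client_name; const char *base_url; } named_client;

static const named_client kClients[] = {   /* parsed from appsettings.json */
    { "promo", "http://promocode-service:80" },
};

static const char *resolve_base(const char *name) {
    for (size_t i = 0; i < sizeof kClients / sizeof *kClients; i++)
        if (strcmp(kClients[i].client_name, name) == 0)
            return kClients[i].base_url;
    return NULL;
}

int main(void) {
    const char *base = resolve_base("promo");          /* CreateClient("promo") */
    if (base)
        printf("%s%s\n", base, "/api/vouchers/{id}");  /* call-site path */
    return 0;
}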

Estimated LOC: ~1200. Largest tier — config parsing across multiple ecosystems is genuinely complex.

Risk: medium-high. Variable resolution can produce false positives; mitigation is confidence scoring on the emitted edges (high confidence when literal, lower when resolved through 2+ hops).

5.4 Combined coverage estimate

After Tiers 1–3 land (Tier 4 is bonus), realistic recall on cross-service edges in modern strongly-typed codebases:

| Code style | Estimated recall |
| --- | --- |
| Go + gRPC + literal HTTP URLs | 95%+ (Tier 1 alone covers most) |
| Java/Spring + Feign + Cloud Stream | 90%+ (Tiers 1+2+3) |
| .NET / CQRS + MediatR + MassTransit + gRPC | 90%+ (Tiers 1+2; HttpClient gap = Tier 4) |
| TypeScript / NestJS + microservices | 85%+ (Tiers 1+2+3) |
| Python / FastAPI + Celery + httpx-codegen | 85%+ |
| Plain Python/Node with literal URLs | unchanged (today's recall still works) |

6. Risks and mitigations

| Risk | Likelihood | Mitigation |
| --- | --- | --- |
| Tree-sitter pattern brittleness across language versions | Medium | YAML registry allows per-version patterns; tests cover N-1 and N versions of each framework |
| YAML registry becomes a maintenance burden | Medium | Limit official registry to top 10 frameworks per language; community contributions land via PR review with required test fixtures |
| False-positive cross-repo edges from name collisions | Low | Confidence scoring on each edge; collisions reported in cbm_cross_repo_result_t.collisions[] |
| Increased index time | Low | New passes are conditional (no .proto files = no IDL pass); benchmarks on every PR |
| Variable URL resolution (Tier 4) produces wrong routes | Medium | Confidence scoring; only emit cross-repo edge if resolved confidence > 0.7; consumer-side validation catches bad matches |
| Reflection / runtime-resolved DI is impossible | High, but acknowledged | Explicitly out of scope (Tier 5); document as known limitation |
| Maintainer-burden objection | Medium | Plugin registry shifts most additions to YAML; core C surface area kept minimal |
| Patch size scares reviewers | High for big-bang, low for tier-by-tier | Submit Tier 1 first as a standalone PR; subsequent tiers reference Tier 1's architecture |

7. Open questions for the maintainer

  1. Pattern-registry format preference: YAML, TOML, JSON, or compiled-in C tables with a build-time generator? YAML is the most readable but adds a YAML parser to the runtime; TOML or JSON minimize parser surface.

  2. Where should IDL files be discovered: walk-the-repo by default, or require explicit idl_paths config? Walk-the-repo gives zero-config UX but may pick up vendored proto files in node_modules or vendor/. Suggest default-walk + a standard exclusion list.

  3. Cross-repo auto-trigger model: store workspace membership in the per-repo artifact, or in a separate workspace-level artifact? Per-repo is simpler but duplicates state; workspace-level is cleaner but adds a new artifact kind.

  4. Confidence scoring: should cross-repo edges carry a confidence property explicitly, or rely on the existing properties JSON blob? A first-class confidence field makes downstream consumers' job easier.

  5. Existing pattern table migration: should the 252 patterns in service_patterns.c migrate to the YAML registry as part of this work, or stay in C with the registry only handling new patterns? Recommendation: keep C patterns as-is for v1, migrate in a separate cleanup PR after the YAML schema is proven stable.

  6. Tier 4 dataflow scope: how aggressive should intra-method variable resolution be? Single-assignment + string-concat is safe; following data through helper methods gets harder. Suggest single-method scope for v1.

  7. Test-fixture monorepo strategy: ship the multi-language fixtures in the cbm repo, or reference an external cbm-test-fixtures repo to keep the main repo small? The fixtures total ~5MB across 3-4 languages — manageable in-tree.


8. Why this is worth merging upstream

cbm's competitive position vs. Sourcegraph / Backstage / Apollo:

  • Sourcegraph does cross-repo references, but per-symbol, not protocol-aware. cbm + Tiers 1–3 would be the only AST-based tool emitting structured CROSS_GRPC_CALLS / CROSS_ASYNC_CALLS edges keyed by protocol identifiers.
  • Backstage does service-graph from declarative IDL files but requires manual catalog upkeep. cbm + this proposal derives the service graph automatically from the same IDL files plus the consuming code.
  • Apollo Studio does federated GraphQL via @key matching. cbm + this proposal generalizes the same idea to gRPC, OpenAPI, AsyncAPI, and typed-message ecosystems.

Position: cbm becomes the only single-binary, AST+LSP-grade tool that derives a complete service interaction graph automatically from source. That's a defensible product position.

The capability is asked for in every code-graph tool's roadmap (often as "service mesh visualization" or "API surface discovery"). cbm has the structural advantage to ship it first.


9. Appendix — example YAML registry entries

Full registry entries for the ten frameworks Tier 2 should ship with:

patterns:
  # ── C# / .NET ──────────────────────────────────────────────────
  - id: masstransit-publish
    languages: [csharp]
    kind: ASYNC_CALLS
    producer:
      match: { type_implements: "IPublishEndpoint", method: "Publish" }
      extract_id_from: generic_arg_or_first_arg_type
      id_kind: message_fqn
      broker: rabbitmq
    consumer:
      match: { class_implements: "IConsumer<T>" }
      extract_id_from: generic_type_arg
      id_kind: message_fqn

  - id: masstransit-send
    languages: [csharp]
    kind: ASYNC_CALLS
    producer:
      match: { type_implements: "ISendEndpoint", method: "Send" }
      extract_id_from: generic_arg_or_first_arg_type
      id_kind: message_fqn
      broker: rabbitmq
    consumer: { same_as: masstransit-publish.consumer }

  - id: nservicebus-publish
    languages: [csharp]
    kind: ASYNC_CALLS
    producer:
      match: { type_implements: "IMessageSession", method: "Publish" }
      extract_id_from: first_arg_type
      id_kind: message_fqn
    consumer:
      match: { class_implements: "IHandleMessages<T>" }
      extract_id_from: generic_type_arg
      id_kind: message_fqn

  # ── Java / Kotlin / Spring ─────────────────────────────────────
  - id: spring-cloud-stream-bridge
    languages: [java, kotlin]
    kind: ASYNC_CALLS
    producer:
      match: { type_implements: "StreamBridge", method: "send" }
      extract_id_from: first_arg
      id_kind: channel_name
    consumer:
      match: { method_annotation: "@StreamListener" }
      extract_id_from: annotation_value
      id_kind: channel_name

  - id: axon-command
    languages: [java, kotlin]
    kind: ASYNC_CALLS
    producer:
      match: { type_implements: "CommandGateway", method: "send" }
      extract_id_from: first_arg_type
      id_kind: message_fqn
    consumer:
      match: { method_annotation: "@CommandHandler" }
      extract_id_from: parameter_type
      id_kind: message_fqn

  # ── Node / TypeScript ──────────────────────────────────────────
  - id: nestjs-message-pattern
    languages: [typescript]
    kind: ASYNC_CALLS
    producer:
      match: { method_annotation: "@MessagePattern" }
      extract_id_from: annotation_value
      id_kind: message_pattern
    consumer:
      match: { method_annotation: "@MessagePattern" }
      extract_id_from: annotation_value
      id_kind: message_pattern

  - id: nestjs-event-pattern
    languages: [typescript]
    kind: ASYNC_CALLS
    producer:
      match: { method: "emit", type_implements: "ClientProxy" }
      extract_id_from: first_arg
      id_kind: event_pattern
    consumer:
      match: { method_annotation: "@EventPattern" }
      extract_id_from: annotation_value
      id_kind: event_pattern

  # ── Python ──────────────────────────────────────────────────────
  - id: faust-agent
    languages: [python]
    kind: ASYNC_CALLS
    producer:
      match: { method_call: "topic.send" }
      extract_id_from: receiver_var_topic_name
      id_kind: kafka_topic
    consumer:
      match: { decorator: "@app.agent" }
      extract_id_from: decorator_arg
      id_kind: kafka_topic

  # ── Go ──────────────────────────────────────────────────────────
  - id: watermill-publish
    languages: [go]
    kind: ASYNC_CALLS
    producer:
      match: { type_implements: "message.Publisher", method: "Publish" }
      extract_id_from: first_arg
      id_kind: topic_name
    consumer:
      match: { type_implements: "message.Subscriber", method: "Subscribe" }
      extract_id_from: first_arg
      id_kind: topic_name

  # ── Rust ───────────────────────────────────────────────────────
  - id: async-nats-publish
    languages: [rust]
    kind: ASYNC_CALLS
    producer:
      match: { method: "publish", type_implements: "Client" }
      extract_id_from: first_arg
      id_kind: nats_subject
    consumer:
      match: { method: "subscribe", type_implements: "Client" }
      extract_id_from: first_arg
      id_kind: nats_subject

Schema notes:

  • match block defines the AST-pattern selector (interface implementation, attribute presence, method-call shape)
  • extract_id_from names a strategy from a fixed enum (generic_type_arg, first_arg, first_arg_type, annotation_value, attribute_arg, receiver_var_topic_name, etc.)
  • id_kind declares the namespace of the extracted identifier (so kafka_topic from one framework matches kafka_topic from another, but never matches message_fqn)
  • broker is optional metadata that flows into the emitted edge
