perf: replace B-tree CONTAINS scan with fulltext index for search() #17
aksOps wants to merge 2 commits
…STENS edges

Tibco EMS, Azure Service Bus/Event Hub, and Spring Events emit different edge kinds than Kafka/RabbitMQ. TopicLinker previously only matched PRODUCES/CONSUMES, silently dropping cross-service CALLS edges for all three messaging patterns.

- Add SENDS_TO and RECEIVES_FROM (Tibco/Azure) as producer/consumer edges
- Add PUBLISHES and LISTENS (Spring Events) as producer/consumer edges
- Add EVENT and MESSAGE_QUEUE node kinds to topic matching (alongside TOPIC/QUEUE)
- Add 4 new test cases: SENDS_TO/RECEIVES_FROM, PUBLISHES/LISTENS, MESSAGE_QUEUE, determinism

Co-Authored-By: Paperclip <noreply@paperclip.ing>
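The producer/consumer matching described in this commit can be illustrated with a minimal sketch. It is not the actual TopicLinker code: the class name `TopicLinkerSketch`, the `Edge` record, and the assumption that both producer and consumer edges point from a service to the topic node are hypothetical, and the node-kind check (TOPIC/QUEUE/EVENT/MESSAGE_QUEUE) is left out for brevity.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Set;

// Hypothetical sketch, not the project's TopicLinker.
final class TopicLinkerSketch {
    // Assumption: every edge runs from a service (source) to a topic-like node (target).
    record Edge(String source, String target, String kind) {}

    // Edge kinds named in the commit message, grouped by role.
    static final Set<String> PRODUCER_KINDS = Set.of("PRODUCES", "SENDS_TO", "PUBLISHES");
    static final Set<String> CONSUMER_KINDS = Set.of("CONSUMES", "RECEIVES_FROM", "LISTENS");

    // Emit one CALLS edge per (producer, consumer) pair that shares a topic node.
    static List<Edge> link(List<Edge> edges) {
        List<Edge> calls = new ArrayList<>();
        for (Edge producer : edges) {
            if (!PRODUCER_KINDS.contains(producer.kind())) {
                continue;
            }
            for (Edge consumer : edges) {
                if (CONSUMER_KINDS.contains(consumer.kind())
                        && consumer.target().equals(producer.target())
                        && !consumer.source().equals(producer.source())) {
                    calls.add(new Edge(producer.source(), consumer.source(), "CALLS"));
                }
            }
        }
        // Sort so the result does not depend on input order (the determinism case).
        calls.sort(Comparator.comparing(Edge::source).thenComparing(Edge::target));
        return calls;
    }
}
```

Grouping the edge kinds into two sets keeps the matching loop unchanged when a new messaging pattern is added; only the sets grow.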
B-tree indexes on label_lower/fqn_lower cannot serve CONTAINS queries in Neo4j, so every search caused a full graph scan. Replace with a fulltext index using the keyword analyzer so wildcard (*text*) queries are backed by an index.

- Add FULLTEXT INDEX search_index on (n.label_lower, n.fqn_lower) in both GraphStore.bulkSave() and EnrichCommand secondary-index block
- Use keyword analyzer to preserve whole-property tokens (avoids Lucene tokenisation splitting FQNs on dots)
- Replace search() CONTAINS queries with db.index.fulltext.queryNodes() + *text* wildcard wrapping
- Escape Lucene special characters before wrapping in toLuceneQuery()
- Add CALL db.awaitIndexes(300) after secondary index creation in EnrichCommand so the first search request hits the index

Fixes RAN-66

Co-Authored-By: Paperclip <noreply@paperclip.ing>
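A minimal sketch of how the pieces in this commit might fit together, under stated assumptions. The class name `FulltextSearchSketch`, the node label `CodeNode`, and the exact set of escaped characters are hypothetical and are not taken from the PR; in particular, escaping whitespace is this sketch's own choice. The Cypher uses the `CREATE FULLTEXT INDEX` form available in recent Neo4j versions.

```java
import java.util.Locale;

// Hypothetical sketch, not the project's GraphStore.
final class FulltextSearchSketch {
    // Assumption: nodes carry a label named CodeNode; the real label is not shown in the PR.
    static final String CREATE_INDEX =
        "CREATE FULLTEXT INDEX search_index IF NOT EXISTS "
        + "FOR (n:CodeNode) ON EACH [n.label_lower, n.fqn_lower] "
        + "OPTIONS { indexConfig: { `fulltext.analyzer`: 'keyword' } }";

    // Blocks for up to 300 seconds until indexes are online.
    static final String AWAIT_INDEXES = "CALL db.awaitIndexes(300)";

    // $query is expected to be the output of toLuceneQuery().
    static final String SEARCH =
        "CALL db.index.fulltext.queryNodes('search_index', $query) "
        + "YIELD node RETURN node LIMIT $limit";

    // Characters the Lucene classic query parser treats as syntax.
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\/";

    // Lowercase the input, backslash-escape query syntax, then wrap in wildcards.
    static String toLuceneQuery(String text) {
        String lower = text.toLowerCase(Locale.ROOT);
        StringBuilder sb = new StringBuilder("*");
        for (int i = 0; i < lower.length(); i++) {
            char c = lower.charAt(i);
            // Whitespace is escaped too so the parser keeps the input as one term.
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.append('*').toString();
    }
}
```

Because the keyword analyzer indexes each property value as a single token, the leading and trailing `*` are what turn the lookup into a substring match, and the escaping keeps user input such as `a:b` from being parsed as a field query.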
Code review

Found 2 issues:

1. The fulltext index is created on Fix: add
2. The escape list covers 15 Lucene special characters but omits Fix: add

🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎.
Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

🤖 Generated with Claude Code
Closing as chain/review PR: superseded by #16, which targets main directly. All changes from this branch are now in main.
Summary
Fixes RAN-66.
B-tree indexes on label_lower/fqn_lower cannot serve CONTAINS predicates in Neo4j: every call to search() caused a full graph scan regardless of the index. This replaces the CONTAINS queries with a fulltext index backed by Lucene.

- Fulltext index (search_index) created on (n.label_lower, n.fqn_lower) using the keyword analyzer: preserves whole-property tokens so FQNs with dots aren't split by the standard tokeniser
- GraphStore.bulkSave(): creates search_index alongside the existing B-tree indexes (used by the serve-path bootstrap)
- EnrichCommand: creates search_index in the secondary-index block and adds CALL db.awaitIndexes(300) so the index is fully ready before the first query
- search(String, int) and search(String): use db.index.fulltext.queryNodes() with *text* wildcard wrapping for substring matching
- toLuceneQuery() helper: lowercases input and escapes Lucene special characters before wrapping

Test plan
- GraphStoreTest#shouldSearch: passes (mocked, verifies plumbing)
- GraphStoreExtendedTest#shouldSearchWithLimit: passes
- EnrichCommandTest: all 3 tests pass
- mvn test: 1473 tests, 0 failures

🤖 Generated with Claude Code