Summary
Use pgvector embeddings to merge or relate clusters that are semantically similar but differ in text (different templates, minor wording changes), improving explain/timeline/compare quality.
Motivation
- README Roadmap lists semantic cluster merging.
- Fingerprinting alone can split one incident across multiple clusters when normalization does not reduce variant messages to the same fingerprint.
Scope (proposal)
Acceptance criteria
- Measurable reduction in near-duplicate clusters on sample data without breaking grounded counts (or with clearly documented semantics for counts after merging).
- Tests that do not require a live embedding API where possible (fixtures / mocks).
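
One way the "no live embedding API" criterion could be met is to inject the embedder as a plain callable and swap in fixture vectors during tests. The sketch below is illustrative only: `FIXTURE_VECTORS`, `fake_embed`, and `near_duplicate_pairs` are hypothetical names, and the vectors are hand-written stand-ins rather than real model output.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical fixture: hand-written vectors standing in for real embeddings.
FIXTURE_VECTORS: Dict[str, List[float]] = {
    "connection timed out to db-1": [1.0, 0.0, 0.0],
    "timeout while connecting to db-1": [0.98, 0.2, 0.0],
    "disk quota exceeded on /var": [0.0, 0.0, 1.0],
}


def fake_embed(text: str) -> List[float]:
    """Deterministic stand-in for the embedding API (assumption: tests inject this)."""
    return FIXTURE_VECTORS[text]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def near_duplicate_pairs(
    texts: Sequence[str],
    embed: Callable[[str], Sequence[float]],
    threshold: float,
) -> List[Tuple[int, int]]:
    """Return index pairs whose embeddings meet or exceed the similarity threshold."""
    vectors = [embed(t) for t in texts]
    return [
        (i, j)
        for i in range(len(texts))
        for j in range(i + 1, len(texts))
        if cosine_similarity(vectors[i], vectors[j]) >= threshold
    ]


def test_near_duplicates_found_without_live_api() -> None:
    texts = list(FIXTURE_VECTORS)
    # Only the two timeout messages are close enough at this threshold.
    assert near_duplicate_pairs(texts, fake_embed, threshold=0.9) == [(0, 1)]
```

Because the embedder is passed in as an argument, the same function could take the real client in production and the fixture in CI.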
Risks
- Merging can hide distinct errors if thresholds are wrong; make thresholds configurable and ship conservative defaults.
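
A minimal sketch of what a configurable, conservative threshold might look like, assuming cosine similarity over cluster embeddings. `MergeConfig`, `Cluster`, `group_similar`, and the 0.95 default are all hypothetical and not taken from the codebase; the default is a placeholder, not a tuned value. Clusters are grouped rather than rewritten, so per-cluster event counts stay available for grounded counts.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class MergeConfig:
    # Assumption: a high default so that only very close clusters are related.
    similarity_threshold: float = 0.95
    # Assumption: a switch to turn semantic merging off entirely.
    enabled: bool = True


@dataclass
class Cluster:
    cluster_id: str
    embedding: List[float]
    event_count: int


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def group_similar(
    clusters: Sequence[Cluster], config: MergeConfig = MergeConfig()
) -> List[List[Cluster]]:
    """Group clusters whose embeddings meet the threshold, using union-find."""
    parent = list(range(len(clusters)))

    def find(i: int) -> int:
        # Follow parent links to the root, halving the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    if config.enabled:
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sim = cosine_similarity(clusters[i].embedding, clusters[j].embedding)
                if sim >= config.similarity_threshold:
                    parent[find(j)] = find(i)

    groups: Dict[int, List[Cluster]] = {}
    for index, cluster in enumerate(clusters):
        groups.setdefault(find(index), []).append(cluster)
    return list(groups.values())
```

Note that union-find grouping is transitive: if A is close to B and B is close to C, all three land in one group even when A and C fall below the threshold. That is one more reason to keep the default strict, or to relate clusters pairwise instead of merging them.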