ConversationLearner: cross-conversation recurrence aggregation by weiyilong-1 · Pull Request #127 · GoogleCloudPlatform/knowledge-catalog

weiyilong-1 · 2026-06-20T22:49:42Z

What

Cross-conversation aggregation for ConversationLearner. Same-identity proposals (asset type + canonical name + gap type) are now merged into a single learning that carries the recurrence signal, instead of keeping only the highest-confidence instance and discarding the rest.

Changes

agent.py
- _aggregate_proposals rewritten to group by identity and merge via a new _merge_group; adds _noisy_or.
- Merged proposals gain occurrence_count, source_conversation_ids, aggregated supporting_evidence, and first_seen / last_seen.
- confidence_grade becomes a recurrence-boosted noisy-OR over instances (1 - prod(1 - c_i)); max_instance_confidence preserves the best single-instance grade.
- generate_learnings tags each proposal with its source conversation (id + timestamp) at the map stage; the transient key is stripped before save. The summary reports how many proposals recur across conversations.
review_app.py - a recurring xN badge on each card, and the Evidence expander lists per-conversation supporting evidence.
tests/test_agent.py - TestAggregateProposals updated for the merge semantics (occurrence counting, provenance roll-up, distinct-conversation counting) plus a new TestNoisyOr.
README.md - cross-conversation bullet updated.
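
For reference, the noisy-OR combination named above (1 - prod(1 - c_i)) can be sketched as follows. This is a minimal illustration, not the code in agent.py: the function name `noisy_or` and the clamping to [0, 1] are assumptions made for the example.

```python
from math import prod


def noisy_or(confidences):
    """Combine per-instance confidences as 1 - prod(1 - c_i).

    Hypothetical sketch: the clamping below is an assumption, not
    necessarily what the real _noisy_or does.
    """
    # Clamp each grade to [0, 1] so a malformed value cannot push
    # the combined result outside the valid range.
    clamped = [min(1.0, max(0.0, c)) for c in confidences]
    # Each extra instance can only shrink the product, so the result
    # never drops below the best single-instance grade.
    return 1.0 - prod(1.0 - c for c in clamped)


if __name__ == "__main__":
    print(noisy_or([0.5, 0.5]))  # two independent 0.5 sightings
    print(noisy_or([0.6]))       # a single instance keeps its own grade
```

Two instances at 0.5 combine to 0.75, which is why max_instance_confidence is kept alongside: it lets a reviewer tell a recurrence-boosted grade apart from a single strong sighting.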

Redaction is unchanged: the new fields are built before the existing _redact_obj save pass (so aggregated evidence is still redacted), and _provenance is stripped before save.
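
The ordering described here (strip the transient key, then run the redaction pass over everything that remains, including the aggregated evidence) might look roughly like the sketch below. `redact_obj`, the email pattern, and `prepare_for_save` are hypothetical stand-ins; the real _redact_obj rules are not shown in this PR description.

```python
import re

# Hypothetical redaction rule: mask anything that looks like an email.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact_obj(obj):
    """Stand-in for _redact_obj: recursively mask strings in nested data."""
    if isinstance(obj, str):
        return EMAIL.sub("[REDACTED]", obj)
    if isinstance(obj, list):
        return [redact_obj(v) for v in obj]
    if isinstance(obj, dict):
        return {k: redact_obj(v) for k, v in obj.items()}
    return obj


def prepare_for_save(proposal):
    """Drop the transient provenance key, then redact what is left."""
    cleaned = {k: v for k, v in proposal.items() if k != "_provenance"}
    # Aggregated fields already exist at this point, so the recursive
    # pass covers supporting_evidence as well.
    return redact_obj(cleaned)
```

Because the new fields are ordinary nested dicts and lists, a recursive pass of this shape covers them without any change to the redaction code itself.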

Test plan

python -m unittest conversation_learner.tests.test_agent -> 93 pass; conversation_learner.tests.test_review_store -> 10 pass.
Verified end-to-end against a live Reasoning Engine (one day): 745 raw -> 408 deduplicated proposals, 99 recurring across multiple conversations, with occurrence_count / supporting_evidence / noisy-OR confidence_grade populated correctly, then loaded into the Streamlit review UI.

Merge per-conversation proposals that share an identity (asset type + canonical name + gap type) into one learning instead of keeping only the highest-confidence instance and discarding the rest:
- Roll up the recurrence signal: occurrence_count, source_conversation_ids, aggregated supporting_evidence, and first_seen/last_seen, from each contributing conversation. Provenance is tagged at the map stage and stripped before save.
- confidence_grade becomes a recurrence-boosted noisy-OR over instances; max_instance_confidence preserves the best single-instance grade.

The review UI gains a 'recurring xN' badge and per-conversation evidence. Tests updated for the merge semantics + a noisy-OR suite; README updated.
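
A rough sketch of the group-and-merge step this commit describes is given below. It is not the actual _aggregate_proposals / _merge_group code. The input shape is assumed: each proposal carries `asset_type`, `canonical_name`, `gap_type`, `confidence_grade`, an `evidence` string, and a `_provenance` dict with `conversation_id` and an ISO 8601 `timestamp`. Counting occurrences by distinct conversation is also an assumption, based on the "distinct-conversation counting" test mentioned in the PR description.

```python
from collections import defaultdict
from math import prod


def aggregate(proposals):
    """Group proposals by identity and merge each group into one learning."""
    groups = defaultdict(list)
    for p in proposals:
        identity = (p["asset_type"], p["canonical_name"], p["gap_type"])
        groups[identity].append(p)

    merged = []
    for (asset_type, name, gap_type), group in groups.items():
        grades = [p["confidence_grade"] for p in group]
        # Distinct conversations only: repeats inside one conversation
        # do not inflate the recurrence count (assumed behaviour).
        conv_ids = sorted({p["_provenance"]["conversation_id"] for p in group})
        # ISO 8601 strings in one format and timezone sort chronologically.
        stamps = [p["_provenance"]["timestamp"] for p in group]
        merged.append({
            "asset_type": asset_type,
            "canonical_name": name,
            "gap_type": gap_type,
            "occurrence_count": len(conv_ids),
            "source_conversation_ids": conv_ids,
            "supporting_evidence": [
                {"conversation_id": p["_provenance"]["conversation_id"],
                 "evidence": p["evidence"]}
                for p in group
            ],
            "first_seen": min(stamps),
            "last_seen": max(stamps),
            # Noisy-OR over instances: 1 - prod(1 - c_i).
            "confidence_grade": 1.0 - prod(1.0 - g for g in grades),
            "max_instance_confidence": max(grades),
        })
    return merged
```

The merged dicts deliberately omit `_provenance`, mirroring the note that the transient key is stripped before save.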

Add a 'Min occurrences' sidebar slider and a 'Sort by' selector (Occurrences by default, Confidence optional). The review queue now sorts most-recurring (then highest-confidence) first, so high-recurrence learnings surface above the 50-card cap instead of being hidden in proposal.json order, and can be filtered by a minimum occurrence threshold.
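
The filter-then-sort behaviour described above could be sketched like this, independent of Streamlit. The function name `review_queue`, its parameters, and the tie-break used when sorting by confidence are assumptions; only the default order (most-recurring, then highest-confidence) and the 50-card cap come from the commit message.

```python
def review_queue(proposals, min_occurrences=1, sort_by="occurrences", cap=50):
    """Filter by minimum occurrences, sort, then apply the card cap."""
    kept = [
        p for p in proposals
        if p.get("occurrence_count", 1) >= min_occurrences
    ]

    def by_occurrences(p):
        # Negate so that larger values sort first.
        return (-p.get("occurrence_count", 1), -p.get("confidence_grade", 0.0))

    def by_confidence(p):
        # Tie-break on occurrences is an assumption for this sketch.
        return (-p.get("confidence_grade", 0.0), -p.get("occurrence_count", 1))

    key = by_confidence if sort_by == "confidence" else by_occurrences
    # Sorting before slicing is what lets high-recurrence learnings
    # surface above the cap instead of depending on file order.
    return sorted(kept, key=key)[:cap]
```

Applying the cap after the sort is the important part: slicing first would keep whatever happened to come first in proposal.json.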

weiyilong-1 added 2 commits June 20, 2026 22:48

weiyilong-1 merged commit 4f068d9 into GoogleCloudPlatform:main Jun 20, 2026
6 checks passed

weiyilong-1 deleted the conversation-learner-recurrence-aggregation branch June 20, 2026 23:03

weiyilong-1 merged 2 commits into GoogleCloudPlatform:main from weiyilong-1:conversation-learner-recurrence-aggregation