ConversationLearner: cross-conversation recurrence aggregation #127
Merged
weiyilong-1 merged 2 commits on Jun 20, 2026
Merge per-conversation proposals that share an identity (asset type + canonical name + gap type) into one learning instead of keeping only the highest-confidence instance and discarding the rest:
- Roll up the recurrence signal from each contributing conversation: occurrence_count, source_conversation_ids, aggregated supporting_evidence, and first_seen/last_seen. Provenance is tagged at the map stage and stripped before save.
- confidence_grade becomes a recurrence-boosted noisy-OR over instances; max_instance_confidence preserves the best single-instance grade.

The review UI gains a 'recurring xN' badge and per-conversation evidence. Tests updated for the merge semantics + a noisy-OR suite; README updated.
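The tag-then-strip handling of provenance could look roughly like the sketch below. Only the `_provenance` key name comes from this PR; the helper names and the shape of the provenance record are assumptions made for illustration, not the actual code in `generate_learnings`.

```python
from typing import Any

PROVENANCE_KEY = "_provenance"  # transient key named in the PR description


def tag_with_provenance(
    proposal: dict[str, Any], conversation_id: str, timestamp: str
) -> dict[str, Any]:
    """Return a copy of the proposal tagged with its source conversation.

    Hypothetical helper: the record layout (conversation_id + timestamp)
    is an assumption based on "id + timestamp" in the description.
    """
    tagged = dict(proposal)
    tagged[PROVENANCE_KEY] = {
        "conversation_id": conversation_id,
        "timestamp": timestamp,
    }
    return tagged


def strip_provenance(learning: dict[str, Any]) -> dict[str, Any]:
    """Return a copy without the transient key, ready to be saved."""
    return {k: v for k, v in learning.items() if k != PROVENANCE_KEY}
```

Copying instead of mutating keeps the per-conversation proposal untouched, so the transient key cannot leak into anything that still holds the original dict.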
Add a 'Min occurrences' sidebar slider and a 'Sort by' selector (Occurrences by default, Confidence optional). The review queue now sorts most-recurring (then highest-confidence) first, so high-recurrence learnings surface above the 50-card cap instead of being hidden in proposal.json order, and can be filtered by a minimum occurrence threshold.
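A minimal sketch of the queue ordering described in this commit, assuming each learning is a dict with a numeric `occurrence_count` and `confidence_grade`. The function name, the `MAX_CARDS` constant, and the tie-breaker used when sorting by confidence are hypothetical; the real UI code may differ.

```python
from typing import Any

MAX_CARDS = 50  # card cap mentioned in the commit; constant name is hypothetical


def build_review_queue(
    learnings: list[dict[str, Any]],
    min_occurrences: int = 1,
    sort_by: str = "occurrences",
) -> list[dict[str, Any]]:
    """Filter by the occurrence threshold, sort, then apply the card cap."""

    def sort_key(item: dict[str, Any]) -> tuple[float, float]:
        occurrences = item.get("occurrence_count", 1)
        confidence = item.get("confidence_grade", 0.0)
        if sort_by == "confidence":
            # Assumed tie-breaker: occurrences second when sorting by confidence.
            return (confidence, occurrences)
        # Default: most-recurring first, then highest-confidence.
        return (occurrences, confidence)

    kept = [
        item
        for item in learnings
        if item.get("occurrence_count", 1) >= min_occurrences
    ]
    # Sort before truncating so high-recurrence learnings survive the cap.
    return sorted(kept, key=sort_key, reverse=True)[:MAX_CARDS]
```

Sorting before truncating is the point of the change: the cap then drops the least-recurring learnings rather than whichever ones happen to come last in file order.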
What
Cross-conversation aggregation for ConversationLearner. Same-identity proposals (asset type + canonical name + gap type) are now merged into a single learning that carries the recurrence signal, instead of keeping only the highest-confidence instance and discarding the rest.
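As an illustration of the merge, here is a minimal sketch that groups proposals by identity and rolls up the recurrence signal. It is not the PR's `_aggregate_proposals`: the function names and the input field names (`asset_type`, `canonical_name`, `gap_type`, `conversation_id`, `timestamp`, `confidence`, `evidence`) are assumptions, timestamps are assumed to be ISO 8601 strings in one format, and the noisy-OR `confidence_grade` is left out to keep the example short.

```python
from collections import defaultdict
from typing import Any


def identity_key(proposal: dict[str, Any]) -> tuple[str, str, str]:
    """Identity = asset type + canonical name + gap type."""
    return (
        proposal["asset_type"],
        proposal["canonical_name"],
        proposal["gap_type"],
    )


def aggregate_by_identity(proposals: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Merge same-identity proposals into one learning per identity."""
    groups: dict[tuple[str, str, str], list[dict[str, Any]]] = defaultdict(list)
    for proposal in proposals:
        groups[identity_key(proposal)].append(proposal)

    merged = []
    for (asset_type, canonical_name, gap_type), group in groups.items():
        # Count distinct conversations, not raw instances.
        conversation_ids = sorted({p["conversation_id"] for p in group})
        # ISO 8601 strings in one format compare correctly as plain strings.
        timestamps = [p["timestamp"] for p in group]
        merged.append(
            {
                "asset_type": asset_type,
                "canonical_name": canonical_name,
                "gap_type": gap_type,
                "occurrence_count": len(conversation_ids),
                "source_conversation_ids": conversation_ids,
                "supporting_evidence": [
                    {"conversation_id": p["conversation_id"], "evidence": p["evidence"]}
                    for p in group
                ],
                "first_seen": min(timestamps),
                "last_seen": max(timestamps),
                "max_instance_confidence": max(p["confidence"] for p in group),
            }
        )
    return merged
```

Counting distinct conversation ids means a proposal repeated within one conversation does not inflate `occurrence_count`, while its evidence is still kept.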
Changes
- `_aggregate_proposals` rewritten to group by identity and merge via a new `_merge_group`; adds `_noisy_or`.
- Merged learnings carry `occurrence_count`, `source_conversation_ids`, aggregated `supporting_evidence`, and `first_seen`/`last_seen`.
- `confidence_grade` becomes a recurrence-boosted noisy-OR over instances (`1 - prod(1 - c_i)`); `max_instance_confidence` preserves the best single-instance grade.
- `generate_learnings` tags each proposal with its source conversation (id + timestamp) at the map stage; the transient key is stripped before save. The summary reports how many proposals recur across conversations.
- The review UI shows a `recurring xN` badge on each card, and the Evidence expander lists per-conversation supporting evidence.
- `TestAggregateProposals` updated for the merge semantics (occurrence counting, provenance roll-up, distinct-conversation counting) plus a new `TestNoisyOr`.

Redaction is unchanged: the new fields are built before the existing `_redact_obj` save pass (so aggregated evidence is still redacted), and `_provenance` is stripped before save.

Test plan
- `python -m unittest conversation_learner.tests.test_agent` -> 93 pass; `conversation_learner.tests.test_review_store` -> 10 pass.
- Checked that `occurrence_count` / `supporting_evidence` / noisy-OR `confidence_grade` are populated correctly, then loaded the output into the Streamlit review UI.
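For reference, the noisy-OR formula `1 - prod(1 - c_i)` from the Changes list can be sketched as below. This is a standalone illustration, not the PR's `_noisy_or`: the function name, the clamping to [0, 1], and the value returned for an empty input are assumptions.

```python
import math
from collections.abc import Iterable


def noisy_or(confidences: Iterable[float]) -> float:
    """Combine per-instance confidences as 1 - prod(1 - c_i).

    Inputs are clamped to [0, 1] (an assumption) so a stray value cannot
    push the result outside that range. An empty input yields 0.0.
    """
    clamped = [min(1.0, max(0.0, c)) for c in confidences]
    return 1.0 - math.prod(1.0 - c for c in clamped)


print(noisy_or([0.5, 0.5]))  # 0.75
```

A single instance keeps its own confidence, and each additional instance can only raise the combined grade, which is what makes the score recurrence-boosted. The result never falls below the best single instance, so `max_instance_confidence` is what tells a reviewer how strong the best individual observation was.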