feat: add experiment-gated self-improving learning, bounded memory, background curation, and auto-create/update skill #252
- Introduced SelfImprovingManager to facilitate background learning from task outcomes.
- Added LearningStore, FeedbackCollector, PatternAnalyzer, ImprovementApplier, and CodeIndexAdapter for modular functionality.
- Implemented experiment gating for enabling/disabling self-improvement features.
- Created comprehensive tests for SelfImprovingManager to ensure functionality and stability.
- Updated localization files to include settings for self-improvement in multiple languages.

…re integration
- Introduced MemoryStore for managing memory entries and snapshots.
- Added SkillUsageStore for tracking skill usage telemetry.
- Implemented ActionExecutor to handle action execution and logging.
- Updated SelfImprovingManager to initialize and utilize MemoryStore and SkillUsageStore.
- Enhanced telemetry reporting in SelfImprovingManager's status.
- Added tests for ActionExecutor, MemoryStore, and SkillUsageStore to ensure functionality.
- Updated types to include new SkillUsageStore and MemoryStore interfaces.

…rovement
- Added ReviewPromptFactory to generate structured review prompts for memory and skill reviews.
- Introduced TranscriptRecall to store and manage transcript entries for task outcomes.
- Enhanced SelfImprovingManager to utilize ReviewPromptFactory and TranscriptRecall.
- Updated CuratorService to support new functionality and maintain skill usage.
- Created unit tests for ReviewPromptFactory, TranscriptRecall, and CuratorService to ensure reliability.
- Modified types and index files to include new services and types.
…settings management
…oryBackendFactory
📝 Walkthrough
Adds an experiment-gated "self-improving" subsystem: Zod types, memory backends, persistent LearningStore, event→pattern analysis, action generation/execution, a SelfImprovingManager orchestrator with timers/curator, prompt injection, provider integration, tests, and i18n strings.
Changes
Self-Improving Learning Subsystem
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Actionable comments posted: 14
🧹 Nitpick comments (2)
src/services/self-improving/types.ts (1)
78-78: ⚡ Quick win
Tighten `getExperiments` to the shared `Experiments` contract.
Line 78 uses `Record<string, boolean>`, which allows misspelled keys and bypasses compile-time checks for experiment flags.
Suggested patch
 import type {
 	ActionType,
+	Experiments,
 	FeedbackSignal,
@@
-	getExperiments: () => Record<string, boolean> | undefined
+	getExperiments: () => Experiments | undefined
src/services/self-improving/SelfImprovingManager.ts (1)
142-166: 💤 Low value
Consider disposing all initialized stores for completeness.
`skillUsageStore`, `curatorService`, and `transcriptRecall` are initialized unconditionally in the constructor but are not disposed in `dispose()`. If these stores hold file handles or other resources, they may leak when the manager is disposed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/types/src/__tests__/learning-memory.test.ts`:
- Line 1: The test file header contains a stale Vitest command and should point
to the test's real path; update the comment on line 1 from "npx vitest run
src/__tests__/learning-memory.test.ts" to reference the correct location
"packages/types/src/__tests__/learning-memory.test.ts" (or use a
workspace-root-friendly command) so local runs target the actual file.
In `@packages/types/src/vscode-extension-host.ts`:
- Around line 364-372: Update the selfImprovingStatus type in
packages/types/src/vscode-extension-host.ts so it matches the object returned by
SelfImprovingManager.getStatus(): add the missing fields memoryEntries,
memoryBackend, skillRecords, and curatorStatus (with the same shapes/types used
by getStatus(), e.g., memoryEntries: number, memoryBackend?: string,
skillRecords: number, and curatorStatus as the status object returned by the
manager). Preferably import or reference the getStatus() return/interface if
available to keep types in sync and avoid duplication; otherwise add fields with
the concrete types used by SelfImprovingManager.getStatus().
In `@src/services/self-improving/__tests__/MemoryBackendFactory.spec.ts`:
- Around line 21-23: The test is not exercising the unknown-backend fallback
because it passes "builtin"; update the spec so MemoryBackendFactory.create is
invoked with an unsupported value (e.g., "unknown" or "unsupported") instead of
"builtin", then assert the returned backend is a MemoryStore; specifically
change the call to MemoryBackendFactory.create("unsupported" as any, baseDir,
logger) and keep the expect(backend).toBeInstanceOf(MemoryStore) assertion to
validate the fallback behavior.
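The fallback that this test is meant to exercise can be sketched roughly as below. This is an illustration under assumptions, not the PR's actual factory: the class bodies, the `createMemoryBackend` function, and the `warn` callback are hypothetical, and only the names `MemoryStore` and `AgentMemoryAdapter` and the `http://localhost:4001` default come from this thread.

```typescript
// Hypothetical shapes for illustration; the PR's real interfaces may differ.
interface MemoryBackend {
	readonly kind: string
}

class MemoryStore implements MemoryBackend {
	readonly kind = "builtin"
	constructor(readonly baseDir: string) {}
}

class AgentMemoryAdapter implements MemoryBackend {
	readonly kind = "agentmemory"
	constructor(readonly baseUrl: string) {}
}

// Assumed default, taken from the URL mentioned in the PR description.
const DEFAULT_AGENTMEMORY_URL = "http://localhost:4001"

function createMemoryBackend(
	backend: string,
	baseDir: string,
	warn: (message: string) => void = () => {},
): MemoryBackend {
	switch (backend) {
		case "agentmemory":
			return new AgentMemoryAdapter(DEFAULT_AGENTMEMORY_URL)
		case "builtin":
			return new MemoryStore(baseDir)
		default:
			// Unknown values fall back to the dependency-free built-in store.
			warn(`Unknown memory backend "${backend}", falling back to builtin`)
			return new MemoryStore(baseDir)
	}
}
```

A spec that passes `"builtin"` only ever reaches the second case, which is why the review asks for an unsupported value to reach the `default` branch.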
In `@src/services/self-improving/__tests__/SkillUsageStore.spec.ts`:
- Around line 20-33: The test currently uses a fixed sleep after calling
SkillUsageStore.getOrCreate which is flaky; replace the setTimeout delay in
SkillUsageStore.spec.ts with a deterministic polling loop that waits for the
persisted file (self-improving/skill-usage.json) to exist and contain the
expected entry within a bounded timeout (e.g., 1–2s), checking repeatedly (using
fs.stat or readFile) and throwing if the timeout elapses; keep assertions the
same and reference SkillUsageStore.getOrCreate and the persisted file path when
implementing the poll.
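A bounded polling helper of the kind suggested here might look like the following sketch. The `waitFor` name, its parameters, and the default timings are assumptions for illustration and are not taken from the PR.

```typescript
// Polls `check` until it returns true or `timeoutMs` elapses.
async function waitFor(
	check: () => boolean | Promise<boolean>,
	timeoutMs = 2000,
	intervalMs = 25,
): Promise<void> {
	const deadline = Date.now() + timeoutMs
	for (;;) {
		if (await check()) return
		if (Date.now() >= deadline) {
			throw new Error(`Condition not met within ${timeoutMs}ms`)
		}
		// Yield to the event loop so the pending write can make progress.
		await new Promise<void>((resolve) => setTimeout(resolve, intervalMs))
	}
}
```

In the spec, `check` would read `self-improving/skill-usage.json` and return true once the expected entry is present, so the test passes as soon as persistence completes and fails with a clear error otherwise.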
In `@src/services/self-improving/AgentMemoryAdapter.ts`:
- Around line 191-199: The forgetByContent function currently will match
everything if substring is empty because
entry.content.toLowerCase().includes(substring.toLowerCase()) is true for "", so
add an early guard in forgetByContent to return 0 (or reject) when substring is
empty or only whitespace; check substring.trim().length === 0 before calling
this.search and bail out. Keep the existing case-insensitive check and the loop
that calls this.forget(entry.id) unchanged; reference the forgetByContent
method, the substring parameter, and the includes(...) check when making the
change.
In `@src/services/self-improving/CuratorService.ts`:
- Around line 118-121: The current shouldRun() condition uses this.firstRunDone,
causing shouldRun() to always return false before run() ever executes and
creating a deadlock; remove the dependency on this.firstRunDone from the gating
logic so shouldRun() no longer permanently denies the first execution —
specifically update the condition in shouldRun() (the method named shouldRun in
CuratorService) to not include "!this.firstRunDone" (keep deferral behavior
based on config and lastRunAt if needed) and leave setting this.firstRunDone
inside run() unchanged.
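The gating described above can be illustrated with a small sketch. The `CuratorGate` class, the `CuratorConfig` fields, and the explicit `now` parameter are hypothetical; the real `CuratorService` likely has a different shape, and this only shows a `shouldRun()` that does not depend on `firstRunDone`.

```typescript
interface CuratorConfig {
	enabled: boolean
	minIntervalMs: number
}

class CuratorGate {
	private firstRunDone = false
	private lastRunAt: number | undefined

	constructor(private readonly config: CuratorConfig) {}

	// Gating depends only on config and the last run time, never on
	// firstRunDone, so the very first run is not blocked.
	shouldRun(now: number): boolean {
		if (!this.config.enabled) return false
		if (this.lastRunAt === undefined) return true
		return now - this.lastRunAt >= this.config.minIntervalMs
	}

	run(now: number): void {
		this.lastRunAt = now
		// Still recorded inside run(), as the review suggests.
		this.firstRunDone = true
	}

	get hasRun(): boolean {
		return this.firstRunDone
	}
}
```

With the original condition, `shouldRun()` required `firstRunDone` to be true while only `run()` could set it, so neither could ever happen first.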
In `@src/services/self-improving/LearningStore.ts`:
- Around line 186-191: The persist() implementation currently writes STATE_FILE,
pattern files, archive files and the index in parallel (via safeWriteJson,
persistPatternFiles, persistPatternFiles, writePatternIndex), which can leave
files inconsistent on partial failure; change persist() to first persist all
pattern artifacts and the index (call persistPatternFiles for this.patternsDir
and this.archiveDir and writePatternIndex sequentially or in a grouped
Promise.all) and only after those complete successfully write the authoritative
state.json last using safeWriteJson (preferably write to a temp file and
atomically rename to STATE_FILE) so recovery reads one clear source of truth;
update function names referenced: persist(), safeWriteJson, persistPatternFiles,
writePatternIndex, and STATE_FILE.
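The ordering recommended here (artifacts first, authoritative state last, each via temp file plus rename) could be sketched as below. `writeJsonAtomic`, `persistAll`, the `Pattern` shape, and the file names are illustrative assumptions rather than the PR's `safeWriteJson` or `persistPatternFiles`.

```typescript
import { promises as fs } from "node:fs"
import * as path from "node:path"

// Writes JSON to a temporary sibling file, then renames it over the target,
// so readers see either the old file or the complete new one.
async function writeJsonAtomic(filePath: string, data: unknown): Promise<void> {
	const tmpPath = `${filePath}.${Date.now()}.tmp`
	await fs.writeFile(tmpPath, JSON.stringify(data, null, 2), "utf8")
	await fs.rename(tmpPath, filePath)
}

interface Pattern {
	id: string
	summary: string
}

// Hypothetical persist(): pattern artifacts and index first, state.json last.
async function persistAll(baseDir: string, patterns: Pattern[], state: unknown): Promise<void> {
	const patternsDir = path.join(baseDir, "patterns")
	await fs.mkdir(patternsDir, { recursive: true })

	await Promise.all(
		patterns.map((p) => writeJsonAtomic(path.join(patternsDir, `${p.id}.json`), p)),
	)
	await writeJsonAtomic(
		path.join(patternsDir, "index.json"),
		patterns.map((p) => p.id),
	)

	// Written only after everything above succeeded.
	await writeJsonAtomic(path.join(baseDir, "state.json"), state)
}
```

If any pattern write fails, `state.json` is left untouched, so recovery still reads the previous consistent state.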
In `@src/services/self-improving/MemoryStore.ts`:
- Around line 99-104: The recall method in MemoryStore concatenates
this.environment and this.userProfile then slices the tail, which can drop newer
entries if array order differs; modify the recall function (after await
this.ensureInitialized()) to merge both arrays, sort the combined array globally
by each entry's timestamp (newest first) and then take the top maxResults before
mapping to this.cloneEntry so results are globally time-accurate; use the
existing variables this.environment, this.userProfile, the recall function and
cloneEntry method to locate and change the logic.
- Around line 126-138: In forgetByContent, calling .includes on an empty string
will match every entry and wipe both stores; add a guard at the start of the
async forgetByContent(substring: string) method (after await
this.ensureInitialized()) to treat empty or whitespace-only substring safely
(e.g., if (!substring || substring.trim() === "") return 0 or throw a specific
error), so that environment and userProfile arrays are not filtered when
substring is empty; reference the forgetByContent function and the environment
and userProfile properties when making this change.
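Both `MemoryStore` findings can be shown on a tiny in-memory stand-in. The `TinyMemory` class and the `MemoryEntry` fields are assumptions for illustration; the real store is asynchronous and file-backed, which this sketch leaves out.

```typescript
interface MemoryEntry {
	id: string
	content: string
	timestamp: number
}

class TinyMemory {
	constructor(
		private environment: MemoryEntry[] = [],
		private userProfile: MemoryEntry[] = [],
	) {}

	// Merge both stores, sort newest first, then take the top results.
	recall(maxResults: number): MemoryEntry[] {
		return [...this.environment, ...this.userProfile]
			.sort((a, b) => b.timestamp - a.timestamp)
			.slice(0, maxResults)
			.map((entry) => ({ ...entry }))
	}

	// An empty or whitespace-only query deletes nothing.
	forgetByContent(substring: string): number {
		const needle = substring.trim().toLowerCase()
		if (!needle) return 0
		const before = this.environment.length + this.userProfile.length
		const keep = (entry: MemoryEntry) => !entry.content.toLowerCase().includes(needle)
		this.environment = this.environment.filter(keep)
		this.userProfile = this.userProfile.filter(keep)
		return before - (this.environment.length + this.userProfile.length)
	}
}
```

Without the guard, `"".includes` semantics mean every entry matches, so a blank query would empty both stores.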
In `@src/services/self-improving/PatternAnalyzer.ts`:
- Around line 124-126: The merge logic in PatternAnalyzer that finds existing
patterns using pattern.summary.includes(toolKey) is fragile; change it to use a
structured context key instead (e.g., check pattern.context?.toolKey === toolKey
or pattern.context?.tools?.includes(toolKey) depending on how tools are
represented) so merges are exact and deterministic; update both the find at the
existingPatterns.find call (patternType "tool") and the similar check around
lines 188-190 to use the structured context property (pattern.context) rather
than substring matching on summary, and ensure any pattern
creation/serialization code sets that context key accordingly.
- Around line 193-199: When updating an existing tool-preference pattern in
PatternAnalyzer (the patterns.push block that spreads existing), preserve
cumulative frequency instead of overwriting it with the current batch count;
change the frequency assignment to use existing.frequency + total (e.g.,
frequency: (existing.frequency ?? 0) + total) so historical evidence is
retained, keep lastSeenAt, successRate and confidenceScore updates as-is, and
ensure you reference the same existing object/variable when computing the new
cumulative value.
In `@src/services/self-improving/TranscriptRecall.ts`:
- Around line 56-67: The record() method mutates and persists this.entries
without ensuring the instance is initialized; call await this.initialize() at
the start of TranscriptRecall.record() (or otherwise ensure initialize()
completes) before pushing/slicing entries and calling this.persist() so prior
on-disk state and any required directories are loaded/created; update references
to this.entries, TranscriptRecall.MAX_ENTRIES, and this.persist() to assume
initialization has completed.
- Around line 119-121: When loading parsed JSON into this.entries in
TranscriptRecall (the block that currently does if (Array.isArray(parsed)) {
this.entries = parsed.slice(-TranscriptRecall.MAX_ENTRIES) }), validate and
sanitize each item before assigning: ensure parsed is an array of objects, each
entry has the expected fields (at minimum a string summary usable by
entry.summary.toLowerCase()), coerce or skip malformed entries, and then take
the last TranscriptRecall.MAX_ENTRIES of the validated list; this prevents
runtime errors during search-time operations like entry.summary.toLowerCase().
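The load-time validation requested above might be sketched like this. The `TranscriptEntry` fields and the `MAX_ENTRIES` value are assumptions; the PR's actual entry type may carry more fields, and it could equally use a Zod schema instead of a hand-written guard.

```typescript
interface TranscriptEntry {
	id: string
	summary: string
	timestamp: number
}

const MAX_ENTRIES = 200 // illustrative cap, not the PR's value

function isTranscriptEntry(value: unknown): value is TranscriptEntry {
	if (typeof value !== "object" || value === null) return false
	const candidate = value as Record<string, unknown>
	return (
		typeof candidate.id === "string" &&
		typeof candidate.summary === "string" &&
		typeof candidate.timestamp === "number" &&
		Number.isFinite(candidate.timestamp)
	)
}

// Drops malformed items first, then keeps the last MAX_ENTRIES valid ones.
function sanitizeEntries(parsed: unknown): TranscriptEntry[] {
	if (!Array.isArray(parsed)) return []
	return parsed.filter(isTranscriptEntry).slice(-MAX_ENTRIES)
}
```

Filtering before slicing matters: slicing first could spend the quota on malformed items and leave fewer valid entries than the file actually holds.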
In `@src/services/self-improving/types.ts`:
- Around line 104-138: DEFAULT_CONFIG and EMPTY_STATE duplicate canonical
defaults and should be removed here and imported from the single source of
truth; replace the local declarations by importing the canonical DEFAULT_CONFIG
and EMPTY_STATE (or the canonical factory functions) used by the types layer,
ensure the file references the LearningConfig and LearningState types and any
identifiers (DEFAULT_CONFIG, EMPTY_STATE) are updated to the imported symbols,
and remove or replace any direct mutations so the single canonical definitions
drive manager behavior.
---
Nitpick comments:
In `@src/services/self-improving/SelfImprovingManager.ts`:
- Around line 142-166: The dispose() method currently persists and disposes
memoryStore but misses disposing other stores initialized in the constructor;
update dispose() to also gracefully persist/close and dispose skillUsageStore,
curatorService, and transcriptRecall (if they exist)—for example, call any
available persist/flush methods before calling their dispose/close methods,
using optional chaining or instanceof checks (e.g.,
this.skillUsageStore?.dispose?.(), this.curatorService?.dispose?.(),
this.transcriptRecall?.dispose?.()) and ensure these calls occur inside the try
block alongside memoryStore disposal so resources are released even on errors.
In `@src/services/self-improving/types.ts`:
- Line 78: Tighten the getExperiments signature to return the shared Experiments
contract instead of a loose Record<string, boolean>; update the type of
getExperiments from "Record<string, boolean> | undefined" to "Experiments |
undefined", import or reference the shared Experiments type and adjust any
callers of getExperiments to use the canonical keys defined by Experiments
(e.g., update types in the exported interface where getExperiments is declared
and ensure usage sites validate against the Experiments shape).
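The benefit of the tighter `getExperiments` type can be seen in a short sketch. The `Experiments` interface below is a hypothetical stand-in for the shared contract (only `selfImproving` and `selfImprovingFullTrust` are names that appear in this thread), and `isExperimentEnabled` is an illustrative helper.

```typescript
// Hypothetical stand-in for the shared Experiments contract.
interface Experiments {
	selfImproving?: boolean
	selfImprovingFullTrust?: boolean
}

type ExperimentId = keyof Experiments

function isExperimentEnabled(
	getExperiments: () => Experiments | undefined,
	id: ExperimentId,
): boolean {
	return getExperiments()?.[id] === true
}

// With Record<string, boolean> the misspelled key below would type-check and
// silently read undefined. With the typed contract it is a compile error:
// isExperimentEnabled(() => ({}), "selfImprovng")
```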
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 12dec26c-96df-46e2-aae0-7bc4f2cbfa8a
📒 Files selected for processing (61)
- packages/types/src/__tests__/learning-memory.test.ts
- packages/types/src/experiment.ts
- packages/types/src/index.ts
- packages/types/src/learning.ts
- packages/types/src/memory.ts
- packages/types/src/vscode-extension-host.ts
- src/__tests__/extension.spec.ts
- src/core/prompts/__tests__/system-prompt.spec.ts
- src/core/prompts/system.ts
- src/core/task/Task.ts
- src/core/webview/ClineProvider.ts
- src/core/webview/generateSystemPrompt.ts
- src/core/webview/webviewMessageHandler.ts
- src/extension.ts
- src/services/self-improving/ActionExecutor.ts
- src/services/self-improving/AgentMemoryAdapter.ts
- src/services/self-improving/CodeIndexAdapter.ts
- src/services/self-improving/CuratorService.ts
- src/services/self-improving/FeedbackCollector.ts
- src/services/self-improving/ImprovementApplier.ts
- src/services/self-improving/LearningStore.ts
- src/services/self-improving/MemoryBackend.ts
- src/services/self-improving/MemoryBackendFactory.ts
- src/services/self-improving/MemoryStore.ts
- src/services/self-improving/PatternAnalyzer.ts
- src/services/self-improving/ReviewPromptFactory.ts
- src/services/self-improving/SelfImprovingManager.ts
- src/services/self-improving/SkillUsageStore.ts
- src/services/self-improving/TranscriptRecall.ts
- src/services/self-improving/__tests__/ActionExecutor.spec.ts
- src/services/self-improving/__tests__/AgentMemoryAdapter.spec.ts
- src/services/self-improving/__tests__/CuratorService.spec.ts
- src/services/self-improving/__tests__/LearningStore.spec.ts
- src/services/self-improving/__tests__/MemoryBackendFactory.spec.ts
- src/services/self-improving/__tests__/MemoryStore.spec.ts
- src/services/self-improving/__tests__/ReviewPromptFactory.spec.ts
- src/services/self-improving/__tests__/SelfImprovingManager.spec.ts
- src/services/self-improving/__tests__/SkillUsageStore.spec.ts
- src/services/self-improving/__tests__/TranscriptRecall.spec.ts
- src/services/self-improving/index.ts
- src/services/self-improving/types.ts
- src/shared/__tests__/experiments.spec.ts
- src/shared/experiments.ts
- webview-ui/src/i18n/locales/ca/settings.json
- webview-ui/src/i18n/locales/de/settings.json
- webview-ui/src/i18n/locales/en/settings.json
- webview-ui/src/i18n/locales/es/settings.json
- webview-ui/src/i18n/locales/fr/settings.json
- webview-ui/src/i18n/locales/hi/settings.json
- webview-ui/src/i18n/locales/id/settings.json
- webview-ui/src/i18n/locales/it/settings.json
- webview-ui/src/i18n/locales/ja/settings.json
- webview-ui/src/i18n/locales/ko/settings.json
- webview-ui/src/i18n/locales/nl/settings.json
- webview-ui/src/i18n/locales/pl/settings.json
- webview-ui/src/i18n/locales/pt-BR/settings.json
- webview-ui/src/i18n/locales/ru/settings.json
- webview-ui/src/i18n/locales/tr/settings.json
- webview-ui/src/i18n/locales/vi/settings.json
- webview-ui/src/i18n/locales/zh-CN/settings.json
- webview-ui/src/i18n/locales/zh-TW/settings.json
Hello. Thanks for your contribution! This looks like a really cool feature! Can you please break down your PR into smaller chunks so we can review it?
That might be challenging, as the pieces are tightly coupled with each other. Any tips?
- guard empty forgetByContent queries in both memory backends
- sort recall results globally by timestamp across stores
- add focused regression tests for deletion guards and recall ordering

- match tool-based patterns by structured context instead of summaries
- preserve cumulative frequency for existing tool preferences
- add regression tests for tool-combination and preference updates

- exercise unknown memory-backend fallback explicitly
- replace fixed sleep with bounded polling in skill usage persistence test
- correct the package types vitest file header path

- lazily initialize transcript recall before recording new entries
- validate persisted transcript entries before loading them
- add regression tests for lazy init and malformed data handling

- reuse shared learning defaults and empty state from @roo-code/types
- tighten experiment access to the shared Experiments contract
- expand webview selfImprovingStatus typing to match manager output
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/services/self-improving/__tests__/TranscriptRecall.spec.ts`:
- Around line 80-81: Update the test assertion in TranscriptRecall.spec.ts to
reflect that getRecent returns entries in newest-first (recency) order: change
the expectation on recall.getRecent(2).map(entry => entry.id) to assert
["entry-1", "entry-0"] so the test aligns with the intended semantics of the
getRecent function.
In `@src/services/self-improving/AgentMemoryAdapter.ts`:
- Around line 192-201: The deletion loop trims and lowercases the input into
normalized but still calls this.search(substring, 50) with the raw substring,
causing mismatches; change the search call in AgentMemoryAdapter (where
substring and normalized are defined) to use normalized instead (i.e.,
this.search(normalized, 50)) so the backend query and the local includes check
(entry.content.toLowerCase().includes(normalized)) use the same
trimmed/lowercased value.
In `@src/services/self-improving/PatternAnalyzer.ts`:
- Around line 193-198: The update currently replaces successRate when merging a
new batch into a pattern; change this to a weighted average so successRate is
cumulative: in PatternAnalyzer where you push into patterns (merging into
existing), compute combinedSuccess = (existing.successRate * existing.frequency
+ successRate * total) / (existing.frequency + total) and set successRate to
that value (using existing.frequency + total as the new frequency), while
keeping frequency, lastSeenAt, and confidenceScore updates as-is; reference the
variables existing, total, successRate and the patterns push/merge logic to
locate and apply the change.
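The weighted update described above can be written out as a small pure function. The `ToolPattern` shape and the `mergeBatch` name are assumptions for illustration; only the formula itself comes from the review comment.

```typescript
interface ToolPattern {
	toolKey: string
	frequency: number
	successRate: number // fraction in [0, 1]
	lastSeenAt: number
}

// Merges one batch of observations into an existing pattern, weighting each
// success rate by the number of observations behind it.
function mergeBatch(
	existing: ToolPattern,
	batchTotal: number,
	batchSuccessRate: number,
	now: number,
): ToolPattern {
	const frequency = existing.frequency + batchTotal
	const successRate =
		frequency === 0
			? 0
			: (existing.successRate * existing.frequency + batchSuccessRate * batchTotal) / frequency
	return { ...existing, frequency, successRate, lastSeenAt: now }
}
```

For example, a pattern seen 8 times at a 0.5 success rate, merged with a batch of 2 fully successful runs, ends at (0.5 × 8 + 1.0 × 2) / 10 = 0.6 instead of jumping to 1.0.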
In `@src/services/self-improving/TranscriptRecall.ts`:
- Around line 57-59: Concurrent record() calls can trigger multiple initialize()
runs and later loadFromDisk() can overwrite this.entries; fix by memoizing
initialization behind a shared promise (e.g., add this.initializingPromise).
Change initialize() to set and return this.initializingPromise (and clear
it/mark this.initialized on resolution), and in record() await
this.initializingPromise if present or assign it when calling initialize() so
all callers await the same promise; reference initialize(), record(),
loadFromDisk(), this.entries, and this.initialized when applying the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: a5505315-4816-403d-b2e7-4c5f4cd50663
📒 Files selected for processing (14)
- packages/types/src/__tests__/learning-memory.test.ts
- packages/types/src/vscode-extension-host.ts
- src/services/self-improving/AgentMemoryAdapter.ts
- src/services/self-improving/MemoryStore.ts
- src/services/self-improving/PatternAnalyzer.ts
- src/services/self-improving/SelfImprovingManager.ts
- src/services/self-improving/TranscriptRecall.ts
- src/services/self-improving/__tests__/AgentMemoryAdapter.spec.ts
- src/services/self-improving/__tests__/MemoryBackendFactory.spec.ts
- src/services/self-improving/__tests__/MemoryStore.spec.ts
- src/services/self-improving/__tests__/PatternAnalyzer.spec.ts
- src/services/self-improving/__tests__/SkillUsageStore.spec.ts
- src/services/self-improving/__tests__/TranscriptRecall.spec.ts
- src/services/self-improving/types.ts
		expect(recall.getRecent(2).map((entry) => entry.id)).toEqual(["entry-0", "entry-1"])
	})
Fix getRecent ordering assertion to reflect recency semantics.
Line 80 currently expects oldest-first order after appending a newer entry. For a getRecent API, this should typically assert newest-first (["entry-1", "entry-0"]) to avoid locking in inverted behavior.
Suggested patch
- expect(recall.getRecent(2).map((entry) => entry.id)).toEqual(["entry-0", "entry-1"])
+ expect(recall.getRecent(2).map((entry) => entry.id)).toEqual(["entry-1", "entry-0"])
		const normalized = substring.trim().toLowerCase()
		if (!normalized) {
			return 0
		}

		const entries = await this.search(substring, 50)
		let removed = 0

		for (const entry of entries) {
			if (entry.content.toLowerCase().includes(normalized)) {
Pass the normalized value into search().
The local match uses normalized, but the backend query still uses the raw substring. Inputs like " foo " can miss search results and leave matching memories undeleted. Use the trimmed value for both steps.
Suggested fix
- const entries = await this.search(substring, 50)
+ const entries = await this.search(normalized, 50)
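The query handling can be sketched as follows. A later commit in this thread passes the trimmed but not lowercased substring to the backend search while keeping the local comparison case-insensitive, and this sketch follows that variant. The `Backend` interface and the synchronous signatures are simplifications for illustration; the real adapter calls a REST API asynchronously.

```typescript
interface Memory {
	id: string
	content: string
}

// Hypothetical, synchronous stand-in for the adapter's search/forget calls.
interface Backend {
	search(query: string, limit: number): Memory[]
	forget(id: string): void
}

function forgetByContent(backend: Backend, substring: string): number {
	const query = substring.trim()
	if (!query) return 0 // blank queries delete nothing

	const needle = query.toLowerCase()
	let removed = 0
	// The backend receives the trimmed text with its original casing.
	for (const entry of backend.search(query, 50)) {
		// Local verification stays case-insensitive before deleting.
		if (entry.content.toLowerCase().includes(needle)) {
			backend.forget(entry.id)
			removed++
		}
	}
	return removed
}
```

Using one trimmed value for both steps means an input such as `" foo "` is searched and verified as `foo`, so surrounding whitespace no longer hides matching memories.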
- search agentmemory with normalized forget queries
- keep prompt-pattern success rates cumulative with weighted updates
- add regression coverage for trimmed queries and cumulative scoring

- memoize first-time initialization behind a shared promise
- prevent concurrent record calls from dropping loaded transcript entries
- add regression coverage for concurrent lazy record paths

- pass trimmed substring to agentmemory search instead of lowercasing it
- keep case-insensitive local verification for actual deletion
- tighten regression coverage to assert trim-without-lowercase semantics

- record correction feedback when users reply to outstanding task asks
- keep generic user-turn tracking intact for normal follow-up messages
- add a regression test around interactive correction replies

- capture real codebase_search hit counts and top score for learning events
- let self-improving accept explicit code index hit details from callers
- add regression tests for the tool seam and manager event recording

- assert activation calls initializeSelfImproving through the provider
- keep the coverage at the extension startup seam rather than only mocked provider construction
379f4b6 to fd40e09
- New InsightsEngine service for pattern discovery across sessions
- CuratorService rework: feedback-driven skill creation, validation, repair
- SelfImprovingManager: pipeline orchestration, debouncing, scheduling
- SkillUsageStore: persistence, decay, reinforcement tracking
- Task integration: curation hook points, completion signals
- ClineProvider: curator event forwarding
- system.ts + experiments: curator feature flagging
- pnpm-lock + webview-ui deps sync
… enabled, persisted config fallback
…AutoMode experiment + rm FileLock dead code
… / selfImprovingFullTrust experiments
…l error healer + experiment toggles UI
…number types, fix logger refs, strip spec file corruption
…Manager log, SelfImprovingManager unused destructure + experiment fallback
…nceService, attempt completion feedback wiring, experiment cleanup
… attempt_completion blocking, AutoModeOrchestrator pattern healing, ActionExecutor timing
…rust guard, taskCompleted reset on user message
# Conflicts: # pnpm-lock.yaml
…Service unsafe option TS error
…coring fixes + test coverage
…actoryService + ReviewTeamService + ClineProvider wiring
…rningStore, TrustService, QuestionEvaluatorService updates + Experiment UI wiring
…lier, update experiment.spec.ts with new IDs
… 17 locale updates
…n* fields optional in global-settings
Description
This PR adds an experiment-gated self-improving subsystem so Zoo can learn from task outcomes, user corrections, and recurring workflow patterns instead of repeatedly making the same corrected mistakes.
Key implementation details in this PR:
- `SelfImprovingManager` as the orchestration layer for learning, review, and prompt-context generation.
- … `CuratorService`, `ReviewPromptFactory`, `TranscriptRecall`, `MemoryStore`, `SkillUsageStore`) to keep retained guidance bounded and maintainable over time.
- `PatternAnalyzer` to detect recurring patterns in task outcomes and workflow signals.
- `FeedbackCollector` to capture user corrections and feedback during task interactions.
- `ImprovementApplier` to apply learned improvements derived from analyzed patterns and feedback.
- `CodeIndexAdapter` to integrate code index search hits into the self-improving pipeline, enriching learning signals with codebase context.
- `ActionExecutor` to execute improvement actions, including skill mutations (auto-create/update skills based on learned patterns).
- … `selfImproving` experiment flag and surfaces status through extension state/settings so it remains opt-in and low-risk.
- … `ARCHITECTURE.md` and `GAP_ANALYSIS.md`) in the service directory to guide future development and review.
- `AgentMemoryAdapter` as an optional memory backend that connects to the agentmemory REST API (https://github.com/rohitg00/agentmemory), providing 124+ memory endpoints and 53 MCP tools as an alternative to the built-in `MemoryStore`.
- `MemoryBackend` interface and `MemoryBackendFactory` so the system can transparently switch between built-in file-based memory and agentmemory's REST API backend via the `memoryBackend` configuration option.

Design choices / trade-offs reviewers should note:
- `MemoryBackend` interface: users can choose between the built-in file-based `MemoryStore` (default, zero external dependencies) or the `agentmemory` REST API backend for advanced semantic search, consolidation, and cross-session memory features.

Video/Screenshot
self-improving-zoo-code.mp4
Commit History
The branch contains 22 commits spanning feature implementation, hardening, and test coverage:
- `a40bdc15` feat: implement Self-Improving Manager for adaptive learning
- `98057936` feat: Enhance SelfImprovingManager with MemoryStore and SkillUsageStore integration
- `c6d0b55c` feat: Implement ReviewPromptFactory and TranscriptRecall for self-improvement
- `445e9ede` feat: Enhance self-improvement system with user message tracking and settings management
- `f15a9a6b` feat: Implement memory backend system with AgentMemoryAdapter and MemoryBackendFactory
- `fb39c36c` fix: harden self-improving memory deletion and recall
- `9f2688ff` fix: tighten self-improving pattern merges
- `53490f1b` test: tighten self-improving regression coverage
- `f1210f7e` fix: harden transcript recall loading
- `327a7d34` refactor: align self-improving shared types
- `f90b8845` fix: refine self-improving memory search and scoring
- `e38aad81` fix: serialize transcript recall lazy initialization
- `7860dc10` fix: preserve substring text in agentmemory forget search
- `4f729743` fix: capture interactive user corrections for self-improving
- `29af44c3` fix: feed code index search hits into self-improving
- `f6581653` test: verify extension startup initializes self-improving
- `f672d566` feat: add skill mutation APIs and auto-skill toggle
- `732be8ff` feat: wire self-improving auto skill actions
- `02ed3aed` feat: expose auto-skill toggle in experimental settings
- `f6d34080` test: cover auto-skill experiment gating
- `7cf030e5` Add configurable self-improving memory backend settings
- `99ce4f29` [verified] Add workspace/global self-improving scope controls

Integration Points
Files modified outside the service folder to wire the self-improving subsystem into the extension:
- `src/extension.ts`: initializes the `SelfImprovingManager` on extension startup
- `src/core/prompts/system.ts`: injects learned guidance context into system prompts when the experiment is enabled
- `src/core/webview/ClineProvider.ts`: exposes self-improving status, settings, and state through the provider
- `src/core/webview/webviewMessageHandler.ts`: refreshes runtime behavior on settings updates (toggle, scope, auto-skill, memory backend)
- `packages/types/src/vscode-extension-host.ts`: adds self-improving state/status typing to the extension host types
- `webview-ui/src/components/settings/ExperimentalSettings.tsx`: adds experiment checkbox, scope selector, auto-skill toggle, memory-backend selector, and agentmemory URL controls
- `webview-ui/src/components/settings/SettingsView.tsx`: wires self-improving settings UI into the cached state pattern

Complete List of New Service Files
All under `src/services/self-improving/`:
- `SelfImprovingManager.ts`
- `LearningStore.ts`
- `MemoryStore.ts`
- `MemoryBackend.ts`
- `MemoryBackendFactory.ts`
- `AgentMemoryAdapter.ts`
- `SkillUsageStore.ts`
- `TranscriptRecall.ts`
- `CuratorService.ts`
- `ReviewPromptFactory.ts`
- `ActionExecutor.ts`
- `ImprovementApplier.ts`
- `PatternAnalyzer.ts`
- `FeedbackCollector.ts`
- `CodeIndexAdapter.ts`
- `types.ts`
- `ARCHITECTURE.md`
- `GAP_ANALYSIS.md`
- `__tests__/`

Test Procedure
- … `selfImproving` experiment in Zoo Code's experimental settings.
- … `SelfImprovingManager`, `LearningStore`, `MemoryStore`, `SkillUsageStore`, `CuratorService`, `TranscriptRecall`, `ActionExecutor`, `ReviewPromptFactory`, `AgentMemoryAdapter`, `MemoryBackendFactory`, `PatternAnalyzer`, `FeedbackCollector`, `ImprovementApplier`, `CodeIndexAdapter`
- … `npx agentmemory`) and configure the `memoryBackend` option to `"agentmemory"` to verify the REST API adapter connects and stores/recalls memory correctly.

Environment:
- … `selfImproving` experiment enabled for feature-path validation

Pre-Submission Checklist
Documentation Updates
Recommended documentation updates:
- … `selfImproving` experiment does.

Additional Notes
I did not find a directly matching open issue/PR in `Zoo-Code-Org/Zoo-Code` for this self-improving learning system. The issue is the critical product gap itself: Zoo currently has no bounded mechanism to retain lessons from corrected failures, repeated task outcomes, or recurring workflow signals, which makes it more likely to repeat avoidable mistakes over time. This PR is the implementation path toward solving that gap with an experiment-gated, bounded, and reversible design.

… `memoryBackend` option. When set to `"agentmemory"`, the system connects to a running agentmemory server at `http://localhost:4001` (configurable). The adapter uses REST API calls (not npm imports) so no additional dependencies are required.

Get in Touch
Discord: @iskandarsulaili
Summary by CodeRabbit
New Features
Documentation