feat(agent): schedule tasks for the next available AI coworker (SQLite-backed)#647
Conversation
… read-only Address review findings: treat task content as untrusted data not instructions (framing + closed kind vocabulary + XML-escaping + session-name sanitization), never expire/evict in_progress tasks, rewrite-free read path on the prompt hook and ox status, resolved agent type for target surfacing, busy-agent guard, cursor reset on /clear and /compact, in-process mutex, and task size caps. https://claude.ai/code/session_01B3Hhcw2zYFUTsQaifkvifk
…, eviction order, single-snapshot count) - emitAgentTasks/status/doctor guard on QueueExists so a read never MkdirAll's the queue dir - writeTaskCursor writes atomically (tmp+rename) - enforceActiveCap evicts least-urgent (priority desc) before oldest - runTasksList computes the ready count from one snapshot, not a second locked read - add QueuePath/QueueExists helpers (no more hardcoded queue paths) https://claude.ai/code/session_01B3Hhcw2zYFUTsQaifkvifk
Co-Authored-By: SageOx <ox@sageox.ai>
|
Ready to act? Review this PR in Change Stack to turn feedback into patch suggestions you can inspect and refine. 📝 WalkthroughWalkthroughAdds a per-repo SQLite agent task queue, agent CLI and hidden producer surface, prompt-surfacing with per-agent throttling, daemon producers for doctor/session-finalize tasks, manager integration, status/doctor checks for stuck tasks, schema + Store implementation, and comprehensive tests and docs. ChangesAgent Task Scheduling System
Estimated code review effort 🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Greptile SummaryThis PR replaces a hand-rolled JSONL task store with an embedded SQLite backend for the agent task-scheduling queue, and wires up the daemon, doctor, prime-time surfacing, and per-prompt hook to drive work to the next available live AI coworker instead of a detached
Confidence Score: 4/5Safe to merge with one race in the doctor fix path worth resolving before long-term reliance on the 'no live work canceled' guarantee. The SQLite store is well-designed and the three prior review blockers are correctly fixed. One remaining race exists in checkAgentTasksStuck: the poison list is built from a store.List snapshot that excludes in-progress tasks, but between that snapshot and the subsequent store.Cancel calls, a newly claimed in-progress task can slip through — terminate's WHERE clause (status NOT IN completed/canceled) matches in-progress, so the live agent's work gets aborted. The PR description states this behavior is fixed, but the fix is incomplete. The impact is bounded (wasted work on a task that has already failed 5+ times, no data loss), and the race window is narrow, but it contradicts a stated guarantee of the change. cmd/ox/doctor_agent_tasks.go (the Cancel loop after the poison list is built) and internal/agenttask/store.go (the terminate WHERE clause, which is intentionally permissive for the general case but creates the opening in the doctor path). Important Files Changed
Sequence DiagramsequenceDiagram
participant Producer as Daemon Producer
participant DB as agent_tasks.db SQLite WAL
participant Hook as UserPromptSubmit Hook
participant Agent as Live AI Coworker
participant Sub as Fresh-context Subagent
Producer->>DB: Enqueue task with dedup_key
Note over DB: partial unique index blocks duplicates
Hook->>DB: QueueExists check then ListView
DB-->>Hook: ready tasks for this agentType
Hook-->>Agent: system-reminder block if ready set changed
Agent->>DB: Claim with AgentID and PID
Note over DB: UPDATE WHERE status=ready atomic guard
DB-->>Agent: claimed Task
Agent->>Sub: dispatch in fresh context
Sub->>Sub: perform ox action for task kind
Sub->>DB: Complete with result note
DB-->>Sub: ok
Agent->>DB: ExtendLease periodically
Note over DB: reconcile on every read
|
There was a problem hiding this comment.
Actionable comments posted: 11
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cmd/ox/agent_tasks_surface.go`:
- Around line 155-157: taskCursorPath currently concatenates raw agentID into a
filesystem path and is vulnerable to path traversal; modify taskCursorPath to
validate and sanitize agentID before joining: reject or normalize values that
contain path separators or parent-directory segments (e.g., "/", "\", or ".."),
allow only a safe whitelist (e.g., alphanumerics, dot, underscore, hyphen) or
use filepath.Base and then verify it matches the whitelist, and ensure the
returned filename always ends with ".json"; update any callers if you change the
function signature to return an error on invalid agentID.
In `@cmd/ox/agent_tasks_test.go`:
- Around line 3-14: Replace uses of os.IsNotExist(err) in this test file with
errors.Is(err, os.ErrNotExist) to follow the repository's error-unwrapping
convention; specifically, add the "errors" package to the import block and
update the assertions/checks that currently call os.IsNotExist(err) (search for
os.IsNotExist in this file, including the occurrences around where
agenttask/agentinstance tests assert file-not-found) to use errors.Is(err,
os.ErrNotExist) instead.
In `@cmd/ox/doctor_agent_tasks_test.go`:
- Around line 38-50: The tests are swallowing setup/read errors (calls to
agenttask.Enqueue, chdirToGitRepo, agenttask.NewStore, and store.Add) which can
mask failures; update the tests (e.g., in TestCheckAgentTasksStuck_* functions)
to check returned errors and fail fast (use t.Fatalf or t.Fatal) when any of
those calls return a non-nil error so test setup errors are visible and
assertions run reliably.
In `@cmd/ox/doctor_agent_tasks.go`:
- Around line 78-87: The current fix branch silences errors from store.Cancel
and always returns PassedCheck("Agent tasks", ...) even when some cancels
failed; update the loop over poison in doctor_agent_tasks.go to track cancel
failures and errors from store.Cancel (e.g., failedCancels and optionally
collect error messages), increment canceled only on success, and after the loop
return PassedCheck only if failedCancels == 0, otherwise return a
FailedCheck("Agent tasks", ...) that includes counts (canceled vs failed) and
relevant error summaries; reference variables/functions: fix, poison,
store.Cancel, canceled, maxTaskAttempts, PassedCheck, FailedCheck.
In `@docs/ai/specs/agent-task-scheduling.md`:
- Around line 98-105: Replace the ASCII lifecycle diagram block with a Mermaid
state diagram: remove the plain fenced ASCII block and add a ```mermaid
stateDiagram-v2``` block that defines states [*], ready, in_progress, completed,
canceled and transitions ready --> in_progress: claim (next), in_progress -->
completed: done, in_progress --> canceled: cancel, and in_progress --> ready:
reclaim with the label "(lease expired OR claiming PID dead on same host)";
ensure the code fence starts with "```mermaid" and use the state names ready,
in_progress, completed, canceled, and [*] exactly as identifiers so the diagram
renders correctly.
- Around line 31-35: The fenced code blocks in the document lack language
specifications, which violates markdownlint rule MD040. Add the language
identifier `text` after the opening triple backticks for the fenced code block
containing the file paths for `.sageox/agent_tasks/agent_tasks.db`,
`.sageox/agent_tasks/agent_tasks.db-wal`, and
`.sageox/agent_tasks/agent_tasks.db-shm`. Apply the same fix to any other
unlabeled fenced code blocks referenced in the document by specifying an
appropriate language (e.g., `text` for file paths, `mermaid` for diagrams).
In `@internal/agenttask/store_test.go`:
- Around line 233-244: Replace the brittle time.Sleep usage in
TestReclaimExpiredLease by deterministically backdating the task row's
lease_expires_at field to a past time (as other tests do) instead of sleeping:
after Claim(ClaimOptions{AgentID:"Oxdead", PID: os.Getpid(), Lease:
time.Millisecond}) update the claimed task's lease_expires_at in the test
database to time.Now().Add(-someDuration) (e.g. -time.Millisecond) so the lease
is expired, then proceed with the reclaim assertions; apply the same replacement
for the similar sleep-based logic in the other test block referenced below (the
one around the later Claim/reclaim assertions).
- Around line 11-23: newTestStore currently opens a Store via NewStore(root) but
never closes it; register a test cleanup by calling t.Cleanup and closing the
Store to avoid leaking DB descriptors. After creating store in newTestStore, add
t.Cleanup(func() { _ = store.Close() }) (or fail the test on close error) so the
Store's resources are released; reference the newTestStore helper, the NewStore
constructor and the Store.Close method in your change.
In `@internal/agenttask/store.go`:
- Around line 64-70: The current QueueExists function treats any os.Stat error
as "no queue"; change its signature to QueueExists(projectRoot string) (bool,
error) and update its implementation to call os.Stat(QueuePath(projectRoot)),
then: if projectRoot == "" return false, nil; if err == nil return true, nil; if
errors.Is(err, os.ErrNotExist) return false, nil; otherwise return false, err —
using errors.Is(err, os.ErrNotExist) (not os.IsNotExist) to distinguish
missing-file from permission/IO failures and surface operational errors instead
of silently returning false. Ensure you import the errors package.
- Around line 190-193: When enqueueing tasks in Add (and the similar
insert/update block later), ignore any caller-provided status/claim metadata:
always set task.Status = StatusReady and clear claim-related fields
(lease_expires_at / LeaseExpiresAt and host/claimant fields) before writing to
the DB so a newly inserted task is immediately claimable and not permanently
wedged; apply this normalization both where task.Status is currently set and in
the later insert/upsert code path (the same code blocks referencing task.Status
and the claim/lease fields).
In `@internal/daemon/agentwork/task_producer.go`:
- Around line 14-18: The current safeSessionName regexp only enforces
charset/length but not the documented machine format (timestamp+user+agent-id);
update the safeSessionName definition to a stricter regexp that enforces an
ISO-like timestamp prefix followed by the user and agent-id components (e.g.
YYYY-MM-DDTHH-MM-SS or YYYY-MM-DDTHH-MM) and the expected separators, while
still limiting total length — replace the existing safeSessionName regexp with
one that matches the timestamp-at-start + hyphen + user + hyphen + agent-id
shape and apply it where safeSessionName is used (e.g., the validation that
rejects session names before they are embedded/parsed).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 97ce7895-5096-4c83-8857-7b0f5e5cf8c7
📒 Files selected for processing (32)
.sageox/.gitignorecmd/ox/agent.gocmd/ox/agent_hook.gocmd/ox/agent_session_pause.gocmd/ox/agent_tasks.gocmd/ox/agent_tasks_surface.gocmd/ox/agent_tasks_test.gocmd/ox/doctor.gocmd/ox/doctor_agent.gocmd/ox/doctor_agent_tasks.gocmd/ox/doctor_agent_tasks_test.gocmd/ox/doctor_sageox.gocmd/ox/doctor_types.gocmd/ox/hooks_post_rewrite_test.gocmd/ox/init_gitignore_test.gocmd/ox/status.godocs/ai/specs/agent-task-scheduling.mdinternal/agenttask/schema.gointernal/agenttask/schema.sqlinternal/agenttask/store.gointernal/agenttask/store_regression_test.gointernal/agenttask/store_test.gointernal/agenttask/task.gointernal/daemon/agentwork/manager.gointernal/daemon/agentwork/manager_test.gointernal/daemon/agentwork/session_finalize_orphan_test.gointernal/daemon/agentwork/task_producer.gointernal/daemon/agentwork/task_producer_test.gointernal/daemon/daemon.gointernal/daemon/daemon_pending_work_test.gointernal/doctor/markers.gointernal/markers/markers.go
Co-Authored-By: SageOx <ox@sageox.ai>
…ema versioning) Co-Authored-By: SageOx <ox@sageox.ai>
There was a problem hiding this comment.
♻️ Duplicate comments (1)
docs/ai/specs/agent-task-scheduling.md (1)
34-38:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd
textlanguage identifier for MD040 compliance.The fenced code block showing file paths still lacks a language specifier, triggering markdownlint MD040.
📝 Proposed fix
-``` +```text .sageox/agent_tasks/agent_tasks.db # embedded SQLite; gitignored, ephemeral, local-only .sageox/agent_tasks/agent_tasks.db-wal # WAL sidecar .sageox/agent_tasks/agent_tasks.db-shm</details> <details> <summary>🤖 Prompt for AI Agents</summary>Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.In
@docs/ai/specs/agent-task-scheduling.mdaround lines 34 - 38, The fenced code
block showing SQLite filenames is missing a language identifier which triggers
markdownlint MD040; update the block (the triple-backtick fence used around the
file paths in agent-task-scheduling.md) to include the "text" language specifier
so the block starts with ```text and the file paths remain unchanged.</details> <!-- cr-comment:v1:3673b94edf53f32ba61d9b70 --> </blockquote></details> </blockquote></details> <details> <summary>🧹 Nitpick comments (1)</summary><blockquote> <details> <summary>internal/agenttask/store_regression_test.go (1)</summary><blockquote> `201-201`: _💤 Low value_ **Consider using `t.Cleanup` for consistency.** Line 218 uses `t.Cleanup` to close store2, but line 201 closes store directly. For consistency, both could use `t.Cleanup`. This ensures resources are cleaned up even if the test fails between the close and the end. <details> <summary>♻️ Suggested refactor</summary> ```diff if _, err := store.Add(&Task{Title: "from old schema"}); err != nil { t.Fatalf("Add: %v", err) } - store.Close() + t.Cleanup(func() { _ = store.Close() }) ``` </details> Also applies to: 218-218 <details> <summary>🤖 Prompt for AI Agents</summary> ``` Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/agenttask/store_regression_test.go` at line 201, Replace the direct call to store.Close() with a t.Cleanup registration to match the pattern used for store2; specifically, instead of calling store.Close() inline, register t.Cleanup(func() { store.Close() }) so the store is always closed when the test finishes (even on failure) and mirrors the handling for store2. ``` </details> <!-- cr-comment:v1:f6e21f3ba7cc9aaa3c66712d --> </blockquote></details> </blockquote></details> <details> <summary>🤖 Prompt for all review comments with AI agents</summary>Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.Duplicate comments:
In@docs/ai/specs/agent-task-scheduling.md:
- Around line 34-38: The fenced code block showing SQLite filenames is missing a
language identifier which triggers markdownlint MD040; update the block (the
triple-backtick fence used around the file paths in agent-task-scheduling.md) to
include the "text" language specifier so the block starts with ```text and the
file paths remain unchanged.
Nitpick comments:
In@internal/agenttask/store_regression_test.go:
- Line 201: Replace the direct call to store.Close() with a t.Cleanup
registration to match the pattern used for store2; specifically, instead of
calling store.Close() inline, register t.Cleanup(func() { store.Close() }) so
the store is always closed when the test finishes (even on failure) and mirrors
the handling for store2.</details> --- <details> <summary>ℹ️ Review info</summary> <details> <summary>⚙️ Run configuration</summary> **Configuration used**: defaults **Review profile**: CHILL **Plan**: Pro Plus **Run ID**: `b7be9839-6dac-4717-82b6-2037d02d2a8d` </details> <details> <summary>📥 Commits</summary> Reviewing files that changed from the base of the PR and between 04555512ff2a88da6ba0d9391d8722a7e152a93a and dadf95646c7952a1122eb5d99af40053719acf55. </details> <details> <summary>📒 Files selected for processing (10)</summary> * `cmd/ox/agent_tasks_surface.go` * `cmd/ox/agent_tasks_test.go` * `cmd/ox/doctor_agent_tasks.go` * `cmd/ox/doctor_agent_tasks_test.go` * `docs/ai/specs/agent-task-scheduling.md` * `internal/agenttask/schema.go` * `internal/agenttask/store.go` * `internal/agenttask/store_regression_test.go` * `internal/agenttask/store_test.go` * `internal/daemon/agentwork/task_producer.go` </details> <details> <summary>🚧 Files skipped from review as they are similar to previous changes (7)</summary> * cmd/ox/agent_tasks_surface.go * cmd/ox/doctor_agent_tasks_test.go * cmd/ox/doctor_agent_tasks.go * internal/daemon/agentwork/task_producer.go * cmd/ox/agent_tasks_test.go * internal/agenttask/store_test.go * internal/agenttask/store.go </details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
…rd + status JSON Co-Authored-By: SageOx <ox@sageox.ai>
Summary
Adds an internal task-scheduling queue so the next available AI coworker runs machine-scheduled chores over a developer's own existing sessions and data — finalizing incomplete recordings, summarizing recorded sessions, anti-entropy — in the live session where that context is already loaded, instead of the daemon spinning up a separate
claude -pbackground worker detached from the developer's session.flowchart TB subgraph PROD ["Producers"] direction TB D["daemon anti-entropy"] M[".needs-doctor-agent marker"] C["ox agent tasks add / Go API"] end D --> Q M --> Q C --> Q Q[("agent_tasks.db<br/>embedded SQLite (WAL)")] Q -- "UserPromptSubmit hook, surface only on change" --> A["live AI coworker"] A -- "tasks next (atomic claim + lease)" --> A A -- "dispatch" --> S["fresh-context subagent"] S -- "tasks done" --> QStorage — embedded SQLite
The access pattern (claim-top-eligible, filter by
target_agent/status, lease-expiry reclaim, prune) is a relational workload. Backing it on a flat JSONL file meant hand-rolling atomicity, dedup, and the active cap — and getting a class of races. Re-backed onmodernc.org/sqlite(pure-Go, no cgo — already a dependency; the same idiom asinternal/whisperandinternal/codedb):UPDATE ... WHERE id=? AND status='ready'; a racing claimer affects zero rows and moves on. Double-claim is impossible by construction.dedup_keyover non-terminal rows; concurrent producers can't double-enqueue.Addcan't bypass the cap.UPDATE/DELETEs.busy_timeout+_txlock=immediate: concurrent readers (hook,ox status) run during a writer's claim; concurrent writers serialize instead of deadlocking.agent_tasks/covers.db/-wal/-shm). NFS unsupported (same local assumption the JSONL store carried).Review blockers fixed (storage swap eliminated the rest by construction)
shortID()replaces uncheckedid[:8]slices (corrupt/short row no longer panics list/claim/doctor).--fixno longer cancels live work — poison predicate now excludes livein_progresstasks; respectstasks extend.claimed_host == this host; an empty or foreign host relies on lease expiry (a PID is meaningless cross-host).loadConfig()snapshot, closing adisabled to enabledrace that could double-run session finalize..needs-doctor-agentmoved to a leaf packageinternal/markers(single source of truth) so a rename can't silently disable doctor-task production across thedoctor/agentworkimport boundary.Command surface
ox agent <id> tasks list | next | done | cancel | extend, plus a hiddentasks addfor producers. Task counts render inox status(human +--json) andox agent list;ox doctorgains anagent-tasks-stuckself-heal + poison-cancel check.Surfacing — every adapter, no silent queue
Tasks reach an agent through two complementary channels plus pull:
flowchart TB Q[("agent_tasks.db")] --> PRIME["prime-time: ox agent prime<br/>surfaces ready count at session start"] Q --> PUSH["mid-session push: UserPromptSubmit hook<br/>emitAgentTasks, throttled"] Q --> PULL["pull: ox agent id tasks next"] PRIME --> ALL["ALL adapters (every adapter runs prime)"] PUSH --> CC["claude-code, codex (verified injectable per-prompt hook)"] PULL --> ALL2["ALL adapters, any time"]ox agent primeat session start (Claude via hook, OpenCode via plugin, amp/pi/aider via their marker). Prime now counts ready tasks for this agent type and emits a high-priority action to claim + run each in a subagent. This is the universal delivery floor — no agent has a silent queue.UserPromptSubmit) and Codex — the hosts with a per-prompt hook whose stdout reaches the model — additionally surface on every prompt, catching work enqueued after prime.target_agent.Security — payload secret guard
payloadis surfaced to the model on claim.Store.Addnow rejects payload keys that look credential-bearing (password/token/secret/api_key/…); the legitimatesessionkey stays allowed.Not in this PR (by construction, not deferral)
Live per-adapter E2E (does Gemini/Codex actually surface on every prompt) requires the real agents and lives in the separate
ox-test-harnessrepo. Adding mid-session push to a third adapter needs that host to expose an injectable per-prompt hook — verifiable only with the live agent. Prime-time surfacing already covers those adapters at session start + pull.Test Plan
go build ./...,go vet ./...,gofmt, and repo lint (-c .config/golangci.yml) all clean.go testgreen forinternal/agenttask,internal/daemon/agentwork,internal/doctor,internal/markers,cmd/ox.Claimyields the task to exactly one; concurrent same-dedup_keyAddkeeps one active row; lease-expired + same-host dead-PID reclaim; empty/foreignclaimed_hostnot PID-checked cross-host;Addrejects non-ready status + sensitive payload keys;ExtendLeaseerrors on a lost claim; schema-version mismatch recreates the DB; prime surfaces ready tasks in<immediate-actions>andcountReadyAgentTaskshonorstarget_agent; daemon producer honors the passed config snapshot.TestGlobalLease*failures ininternal/daemonare environment-driven (global-lease lock path) and fail identically on the base branch — unrelated to this change.Checklist (expand if needed)
ox-test-harnessfollow-up)tasks)agenttaskAPI and all caller sites unchanged/security-review— N/A: nointernal/auth/,internal/mcp/,raw_writer.go,prepush_scan.go,internal/upgrade/, orgo.sumchanges.payloadis surfaced to the model and is now guarded inStore.Add(sensitive-key rejection);taskCursorPathvalidatesagentIDagainst path traversal; task title/kind are surfaced as XML-escaped untrusted data only.Co-authored-by: SageOx ox@sageox.ai