Fix workspace bootstrap context delivery for agents #402

Open
kdxcxs wants to merge 6 commits into openagents-org:develop from kdxcxs:fix/agent-bootstrap-context

kdxcxs commented May 23, 2026

Contributor

Summary

  • Move OpenAgents workspace identity out of shared skill content and into per-thread bootstrap context.
  • Write agent skills to the correct CLI-specific locations, including Cursor’s .cursor/skills/openagents-workspace/SKILL.md.
  • Persist workspace.agent.bootstrap events for backend-created channels so agents receive runtime context before handling messages.
  • Keep skill content focused on tool usage instructions, with dynamic per-agent/per-thread values supplied through bootstrap context and runtime env.

Why

  • The existing skill location was wrong for Cursor, so the CLI could fail to discover the OpenAgents skill.
  • Skill content is loaded on demand. If OpenAgents context only lives in a skill, the agent may not know it is operating inside an OpenAgents workspace unless the user explicitly mentions it.
  • A skill file can be shared by multiple agents in the same repo, and the same agent can participate in different threads. Per-agent or per-thread identity such as agent name, workspace ID, channel/thread, and runtime role should not be written into one shared skill file.
  • Separating bootstrap context from skills avoids leaking one agent/thread’s identity into another while still giving each session the workspace context it needs up front.

Bootstrap Mechanism

  • The backend emits a workspace.agent.bootstrap event when an agent becomes a participant in a channel, including workspace-created default sessions, channel create/join flows, message-routing auto-joins, and routine channels.
  • The connector polls workspace.* events and processes workspace.agent.bootstrap before regular workspace.message.posted events in the same batch.
  • Each adapter handles bootstrap once per channel/thread by seeding the underlying agent session with OpenAgents runtime context, then stores/resumes that agent session for later user messages.
  • Normal user messages are not prepended with bootstrap text, so workspace context is injected once instead of repeatedly polluting the conversation.
  • Skills remain static tool instructions and use runtime env placeholders for dynamic values such as workspace ID/token/agent name.
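
The ordering and once-per-channel behavior described above can be sketched roughly as follows. This is an illustration only, written in Python for brevity rather than in the connector's own language; the `Adapter` class, the `channel` and `text` event fields, and the helper names are assumptions, not the code in this PR.

```python
from dataclasses import dataclass, field

BOOTSTRAP = "workspace.agent.bootstrap"
MESSAGE = "workspace.message.posted"


@dataclass
class Adapter:
    """Hypothetical adapter that seeds each channel's session at most once."""
    seeded: set = field(default_factory=set)
    log: list = field(default_factory=list)

    def handle(self, event: dict) -> None:
        channel = event["channel"]
        if event["type"] == BOOTSTRAP:
            if channel in self.seeded:
                return  # already bootstrapped; do not re-inject context
            self.seeded.add(channel)
            self.log.append(("seed", channel))
        elif event["type"] == MESSAGE:
            # User text is forwarded as-is, with no bootstrap prefix.
            self.log.append(("message", channel, event["text"]))


def order_batch(events: list) -> list:
    """Put bootstrap events first. sorted() is stable, so the relative
    order inside each group is preserved."""
    return sorted(events, key=lambda e: e["type"] != BOOTSTRAP)


def process_batch(adapter: Adapter, events: list) -> None:
    for event in order_batch(events):
        adapter.handle(event)
```

With this shape, a batch that happens to list a message ahead of its channel's bootstrap event still seeds the session first, and a repeated bootstrap event for the same channel is ignored.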

Test plan

  • node --test test/bootstrap-event.test.js test/cursor.test.js test/workspace-client.test.js
  • .venv/bin/pytest tests/test_channel_membership.py tests/test_routines.py -q
  • .venv/bin/pytest tests/test_events.py::TestPollEvents::test_poll_accepts_connector_default_limit -q
  • Fresh docker-compose + dev agn run with two Cursor agents; verified cursor-alpha consumed the persisted bootstrap event.

kdxcxs added 3 commits May 23, 2026 14:15
Seed workspace context through dedicated bootstrap events so agent messages stay clean, while keeping tool skills free of dynamic identity and token state.
Use directory-based SKILL.md locations for Claude, OpenCode, and OpenClaw so each adapter matches its skill discovery contract.
Ensure backend-created workspace and routine channels emit bootstrap events so agents receive runtime context before processing work. Also allow the connector's default events poll size.

vercel Bot commented May 23, 2026

@kdxcxs is attempting to deploy a commit to the Raphael's projects Team on Vercel.

A member of the Team first needs to authorize it.

zomux left a comment

Contributor

Thorough Review — Changes Requested

This is a well-designed PR that fixes a real security issue (shared SKILL.md leaking agent identity across agents). The bootstrap event mechanism is sound and the per-process env var injection is the right approach.

Blocking

Routine channel rename is a breaking change
The channel naming changed from routines:<agent> to routine:<id>. Existing production routines using the old channel names will be orphaned — their events will land in channels nothing is listening to. This needs either:

  • An Alembic migration to rename existing routine channels
  • Backward-compatibility handling that recognizes both patterns
  • Or at minimum, documentation of the migration path for existing deployments
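
As a rough illustration of the backward-compatibility option, a parser that recognizes both naming patterns might look like the sketch below. The function name and return shape are hypothetical, and resolving a legacy agent name to a routine ID would still need a lookup that is not shown here.

```python
import re
from typing import Optional, Tuple

# Legacy pattern: routines:<agent>   Current pattern: routine:<id>
LEGACY = re.compile(r"^routines:(?P<agent>.+)$")
CURRENT = re.compile(r"^routine:(?P<id>.+)$")


def parse_routine_channel(name: str) -> Optional[Tuple[str, str]]:
    """Return ("id", value) for the new pattern, ("agent", value) for the
    legacy pattern, or None if the name is not a routine channel."""
    match = CURRENT.match(name)
    if match:
        return ("id", match.group("id"))
    match = LEGACY.match(name)
    if match:
        return ("agent", match.group("agent"))
    return None
```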

Should Address

1. _emit_agent_bootstrap_event is imported as a private function across modules
routines.py and workspaces.py import _emit_agent_bootstrap_event from workspace_mod.py. If it's part of the module's public API, drop the underscore prefix.

2. Duplicated spawn/process boilerplate
Claude, Cursor, and Gemini adapters each have nearly identical _spawnXxxProcess and bootstrap methods (~50 lines each). Consider consolidating into the base adapter.
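
One possible shape for that consolidation, sketched in Python for brevity rather than in the adapters' own language: the base class owns the session bookkeeping and each subclass only describes its command line. The class names, the `runner` callable, and the CLI names and flags are all placeholders, not the real adapters or their CLIs.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, List


class BaseAdapter(ABC):
    """Shared bootstrap/session bookkeeping; subclasses describe their CLI."""

    def __init__(self, runner: Callable[[List[str]], str]):
        # runner is a hypothetical stand-in for spawning a CLI process:
        # it takes an argv list and returns a session id.
        self._runner = runner
        self._sessions: Dict[str, str] = {}

    @abstractmethod
    def build_argv(self, prompt: str) -> List[str]:
        """Return the CLI-specific command line for a prompt."""

    def bootstrap(self, channel: str, context: str) -> str:
        # Seed a session once per channel, then reuse it.
        if channel not in self._sessions:
            self._sessions[channel] = self._runner(self.build_argv(context))
        return self._sessions[channel]


class ExampleCliA(BaseAdapter):
    def build_argv(self, prompt: str) -> List[str]:
        return ["example-cli-a", "--prompt", prompt]  # placeholder flags


class ExampleCliB(BaseAdapter):
    def build_argv(self, prompt: str) -> List[str]:
        return ["example-cli-b", "run", prompt]  # placeholder flags
```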

Positive Notes

  • Security improvement — identity no longer hardcoded in shared skill files. Per-process env vars and target-filtered bootstrap events are the right approach.
  • Good test coverage — 4 new test files covering bootstrap events, Cursor/Gemini adapters, skill paths, and bootstrap-before-message ordering.
  • No migration files — uses existing EventRecord model, no schema changes.
  • Bootstrap filtering is correct: target and target_agents are properly checked before delivering events.
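
For readers unfamiliar with the filtering, a check of this kind could look roughly like the sketch below. The field semantics are assumptions (`target` as a single agent name, `target_agents` as a list, and events with neither field treated as untargeted); the actual rules in the PR may differ.

```python
def is_deliverable(event: dict, agent: str) -> bool:
    """Return True if the event may be delivered to the given agent.

    Assumed semantics (not confirmed by the PR):
      - target: a single agent name, if present
      - target_agents: a list of agent names, if present
      - an event with neither field is treated as untargeted
    """
    target = event.get("target")
    target_agents = event.get("target_agents")
    if target is not None and target != agent:
        return False
    if target_agents is not None and agent not in target_agents:
        return False
    return True
```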

CI

  • Ubuntu/macOS tests all pass (Node 18/20/22)
  • agent-smoke and Windows failures are pre-existing infrastructure issues

kdxcxs commented May 25, 2026

Contributor Author

I addressed the public bootstrap emitter naming and added the routine channel migration. I’m leaving the adapter spawn/bootstrap consolidation for a follow-up PR because it touches three CLI adapters and is a structural refactor rather than part of the bootstrap context fix.

kdxcxs commented May 25, 2026

Contributor Author

Longer term, I think an event-driven backend would be a better fit for the workspace system, but we should not jump straight to a full message queue unless the product actually needs it.

The backend already has an events table that represents durable facts, but many side effects are still handled inline. Creating a channel can also emit bootstrap events, routines directly trigger pipeline processing, and adapters each maintain their own bootstrap/session orchestration. As the system grows, this becomes a halfway event-driven architecture: events exist, but lifecycle work is still spread across routers, mods, scheduler code, and adapter-specific logic.

A cleaner direction would be to model important facts as durable events:

  • routine.created
  • channel.member_added
  • agent.bootstrap_requested
  • routine.fired

Side effects should then move into dedicated workers or consumers:

  • Creating bootstrap events
  • Dispatching routine messages
  • Notifying adapters
  • Generating channel titles
  • Retrying failed lifecycle work

Adapters could subscribe to explicit lifecycle and message events instead of each adapter carrying more bespoke initialization logic. The database should remain the source of truth, with an event log or outbox providing reliable delivery and retry semantics.

The best first step is likely a DB-backed outbox plus worker, not Kafka, RabbitMQ, or another full MQ immediately. For example, a Postgres event_outbox table with SKIP LOCKED workers would already provide reliable async processing, retries, and clearer separation of concerns. If throughput or service boundaries later require it, the system can move to Redis Streams, NATS, RabbitMQ, or Kafka.
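
To make the worker side concrete, here is a minimal sketch of a single drain pass, assuming a DB-API connection with psycopg-style `%s` placeholders. The `event_outbox` columns (`event_type`, `payload`, `attempts`, `processed_at`) and the `drain_once` helper are hypothetical and not part of this PR.

```python
# Claim unprocessed rows; SKIP LOCKED lets concurrent workers pass over
# rows another worker has already locked instead of blocking on them.
CLAIM_SQL = """
SELECT id, event_type, payload
FROM event_outbox
WHERE processed_at IS NULL AND attempts < %s
ORDER BY id
LIMIT %s
FOR UPDATE SKIP LOCKED
"""
MARK_DONE_SQL = "UPDATE event_outbox SET processed_at = now() WHERE id = %s"
MARK_FAILED_SQL = "UPDATE event_outbox SET attempts = attempts + 1 WHERE id = %s"


def drain_once(conn, handlers, batch_size=10, max_attempts=5):
    """Claim and process one batch inside a single transaction.

    handlers maps event_type to a callable taking the payload.
    Returns the number of rows processed successfully.
    """
    cur = conn.cursor()
    cur.execute(CLAIM_SQL, (max_attempts, batch_size))
    rows = cur.fetchall()
    done = 0
    for row_id, event_type, payload in rows:
        try:
            handlers[event_type](payload)
        except Exception:
            # Leave the row unprocessed so a later pass can retry it.
            cur.execute(MARK_FAILED_SQL, (row_id,))
        else:
            cur.execute(MARK_DONE_SQL, (row_id,))
            done += 1
    conn.commit()  # releases the row locks taken by the claim query
    return done
```

Because a handler can succeed and the commit can still fail, this gives at-least-once delivery, so handlers would need to be idempotent.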

I think a real event-driven backend also fits serverless architecture better. In a serverless setup, request handlers should ideally stay thin: validate input, persist state changes, and enqueue durable follow-up work. Long-running or failure-prone side effects such as routine firing, bootstrap dispatch, notifications, title generation, and adapter lifecycle coordination can then run in separate workers or triggered functions with their own retry policies. A DB-backed outbox is a good stepping stone here because it keeps the database as the source of truth while still giving us reliable async execution. Later, if we move more of the backend to serverless infrastructure, the same event boundaries can map naturally to queue consumers, scheduled functions, or pub/sub subscribers without rewriting the domain flow.
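
The "thin handler" idea can be sketched as follows, using SQLite purely so the example is self-contained. The table layout and the `add_member` function are hypothetical. The point is only that the state change and the outbox row are written in the same transaction, so neither can exist without the other.

```python
import json
import sqlite3


def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical schema for illustration only.
    conn.executescript("""
        CREATE TABLE channel_members (
            channel TEXT NOT NULL,
            agent TEXT NOT NULL,
            PRIMARY KEY (channel, agent)
        );
        CREATE TABLE event_outbox (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            event_type TEXT NOT NULL,
            payload TEXT NOT NULL,
            processed_at TEXT
        );
    """)


def add_member(conn: sqlite3.Connection, channel: str, agent: str) -> None:
    """Validate, persist the state change, and enqueue follow-up work."""
    if not channel or not agent:
        raise ValueError("channel and agent are required")
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute(
            "INSERT INTO channel_members (channel, agent) VALUES (?, ?)",
            (channel, agent),
        )
        conn.execute(
            "INSERT INTO event_outbox (event_type, payload) VALUES (?, ?)",
            ("channel.member_added",
             json.dumps({"channel": channel, "agent": agent})),
        )
```

A separate worker would then pick up the `channel.member_added` row and perform the slower side effects, such as creating the bootstrap event.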
