Summary
Add an opt-in test-only agent that drives the operator-facing flows of psychological-operations end-to-end against a fully simulated X — the X App setup wizard on a fake console.x.com, the per-psyop OAuth handshake on a fake x.com, and For-You scrolling on a fake x.com timeline. The integration-test harness already mocks the X v2 API at the HTTP layer (psychological-operations-cli/src/x/mock.rs, gated by PSYCHOLOGICAL_OPERATIONS_MOCK_X_API), but every browser-side flow still requires a human. This agent closes that gap so the run loop can be exercised in CI without an operator and without ever contacting real X.
TOS boundary (load-bearing)
This agent must never run against real X. Automating browser interactions on x.com / console.x.com violates X's Terms of Service. Using this agent against any real X surface is the operator's problem, not a supported mode.
The implementation enforces this with three independent guards. Any one missing aborts the agent at startup:
- Env-var gate. PSYCHOLOGICAL_OPERATIONS_TEST_AGENT=1 must be set explicitly. Default-off; never set in release configs. Pairs with PSYCHOLOGICAL_OPERATIONS_MOCK_X_API=1 (also required: the agent refuses to launch when the X API is hitting the real network).
- Cargo feature gate. Wrap the agent module in #[cfg(feature = "test-agent")]. Don't enable in the default feature set; release builds in psychological-operations-cli/install.sh and .github/workflows/release.yml must omit it. Code paths that route browser navigation are also cfg-gated so the agent module simply doesn't exist in shipped binaries.
- URL allowlist. At launch, the agent inspects every URL it's about to drive Chromium toward. If the host resolves to anything outside the local mock origins (loopback, *.test, or whatever the mock harness uses), the agent panics with a clear "real X detected, refusing to drive" message. Belt-and-suspenders to catch a misconfiguration where the mock didn't actually start.
The README and the agent's own startup banner say plainly that this is for the mock harness only and that anyone pointing it at real X is on their own.
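A minimal sketch of the env-var gate and the URL allowlist, assuming the allowlist is applied to the host Chromium will actually connect to after any resolver mapping. The env-var names and the panic message come from this issue; check_env_gate, is_mock_host, and check_resolved_hosts are hypothetical names, not existing functions in the CLI. The Cargo feature gate is a compile-time #[cfg] and is not shown here.

```rust
use std::net::IpAddr;

// Env-var names are taken from the issue text; everything else is illustrative.
const TEST_AGENT_VAR: &str = "PSYCHOLOGICAL_OPERATIONS_TEST_AGENT";
const MOCK_X_API_VAR: &str = "PSYCHOLOGICAL_OPERATIONS_MOCK_X_API";

/// Env-var gate: both variables must be exactly "1".
/// `lookup` stands in for `std::env::var` so the check can be tested
/// without mutating the process environment.
fn check_env_gate(lookup: impl Fn(&str) -> Option<String>) -> Result<(), String> {
    for var in [TEST_AGENT_VAR, MOCK_X_API_VAR] {
        if lookup(var).as_deref() != Some("1") {
            return Err(format!("{var}=1 is required; refusing to start the test agent"));
        }
    }
    Ok(())
}

/// URL allowlist: true only for loopback addresses, `localhost`,
/// or names under the reserved `.test` TLD.
fn is_mock_host(host: &str) -> bool {
    let host = host.trim_end_matches('.').to_ascii_lowercase();
    // IPv6 literals appear in URLs as `[::1]`; strip the brackets before parsing.
    let bare = host.trim_start_matches('[').trim_end_matches(']');
    if let Ok(ip) = bare.parse::<IpAddr>() {
        return ip.is_loopback();
    }
    host == "localhost" || host.ends_with(".test")
}

/// Rejects the whole launch if any target host falls outside the allowlist.
fn check_resolved_hosts<'a>(hosts: impl IntoIterator<Item = &'a str>) -> Result<(), String> {
    for host in hosts {
        if !is_mock_host(host) {
            return Err(format!("real X detected, refusing to drive (host: {host})"));
        }
    }
    Ok(())
}
```

In the agent itself the lookup would presumably be `|k| std::env::var(k).ok()`, and an Err from either check would become the startup panic described above.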
What the operator does today
All three flows live in Chromium profiles managed by the CLI:
- X App setup (psychological-operations-cli/src/x_app/setup.rs:30-66). CLI spawns Chromium with the auth extension loaded, lands on https://console.x.com/. Operator signs in, creates a Project + App, configures user-auth settings (Web App, Read+Write, callback http://127.0.0.1/callback), copies Client ID + Client Secret + Bearer Token, pastes into the auth extension's popup form, clicks Save.
- Per-psyop OAuth handshake (psychological-operations-cli/src/oauth/setup.rs). CLI launches per-psyop Chromium profile, opens X's authorize URL with the PKCE challenge, binds a localhost callback listener. Operator signs into X with the account this psyop should act as, clicks the Authorize button on the OAuth consent screen, callback resolves, tokens land in ~/.psychological-operations/tokens/<name>.json.
- For-You scrolling. Operator runs psyops browse <name>, the scrape extension's content script (psychological-operations-chromium-extension-scrape/content_script.js) walks the For-You DOM as the operator scrolls, sends tweet IDs over native messaging into for_you_queue.
What the test agent does
For each flow, the agent steps in for the operator:
- X App setup. Drives the mock console.x.com wizard end-to-end: clicks Create Project, fills in the app form, configures user-auth settings (writes the 127.0.0.1/callback URL, picks Web App + Read+Write), copies the synthetic Client ID / Client Secret / Bearer Token from the mock keys-and-tokens page, pastes them into the auth extension's popup, hits Save. Verifies ~/.psychological-operations/x_app.json lands on disk with the expected values.
- Per-psyop OAuth handshake. On the mock x.com, signs in as the psyop's pre-seeded test account (mock auth, no real password roundtrip), clicks Authorize on the OAuth consent screen, lets the localhost callback resolve. Verifies tokens/<name>.json is written with a valid access token + refresh token.
- For-You scrolling. Drives the mock For-You feed by scrolling the <main> element through N pre-seeded tweet articles. Each article matches the DOM shape the scrape extension's selector expects (article[data-testid="tweet"] with a /<handle>/status/<id> permalink). Verifies all N IDs land in for_you_queue.
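The For-You verification step could look roughly like the sketch below. It assumes the agent knows which IDs were seeded into the mock feed and can read the queued IDs back as plain integers. How for_you_queue is stored is not specified here, so reading it is left out. tweet_id_from_permalink and diff_queue are hypothetical helper names.

```rust
use std::collections::BTreeSet;

/// Extracts `<id>` from a `/<handle>/status/<id>` permalink path.
/// Returns None when the shape does not match or the ID is not purely numeric.
fn tweet_id_from_permalink(path: &str) -> Option<u64> {
    // Drop any query string or fragment before splitting into segments.
    let path = path.split(|c: char| c == '?' || c == '#').next()?;
    let mut parts = path.trim_matches('/').split('/');
    let handle = parts.next()?;
    let status = parts.next()?;
    let id = parts.next()?;
    // Exactly three segments: handle, the literal "status", and the ID.
    if handle.is_empty() || status != "status" || parts.next().is_some() {
        return None;
    }
    if id.is_empty() || !id.bytes().all(|b| b.is_ascii_digit()) {
        return None;
    }
    id.parse().ok()
}

/// Compares the IDs seeded into the mock feed with the IDs found in the queue.
/// Returns (missing, unexpected); both empty means the flow passed.
fn diff_queue(seeded: &[u64], queued: &[u64]) -> (Vec<u64>, Vec<u64>) {
    let seeded: BTreeSet<u64> = seeded.iter().copied().collect();
    let queued: BTreeSet<u64> = queued.iter().copied().collect();
    let missing = seeded.difference(&queued).copied().collect();
    let unexpected = queued.difference(&seeded).copied().collect();
    (missing, unexpected)
}
```

Comparing sets rather than counts means a duplicated ID cannot mask a missing one, and the failure message can name the exact IDs that never arrived.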
The agent uses the Chrome DevTools Protocol (CDP), which the existing Chromium binary already speaks, so no extra browser-side dependency is needed; the CLI side needs only a CDP client crate. Driving the page deterministically via CDP keeps tests fast and reproducible. No LLM in the loop for v1; this is plain UI scripting.
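To make "plain UI scripting" concrete, here is a sketch of the JSON a single scroll step might send over the DevTools WebSocket. Runtime.evaluate and its expression parameter are part of CDP; json_escape, scroll_main_command, and the pixel step are illustrative assumptions. A real implementation would more likely go through whichever client crate is chosen than build messages by hand.

```rust
/// Escapes a string for embedding inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters must be \u-escaped in JSON.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds a CDP `Runtime.evaluate` request that scrolls the `<main>` element
/// down by `pixels`. `id` is the caller-chosen message ID that CDP echoes
/// back in the matching response.
fn scroll_main_command(id: u64, pixels: u32) -> String {
    let expression = format!("document.querySelector('main').scrollBy(0, {pixels})");
    format!(
        "{{\"id\":{id},\"method\":\"Runtime.evaluate\",\"params\":{{\"expression\":\"{expr}\"}}}}",
        expr = json_escape(&expression)
    )
}
```

The agent would send one such message per scroll step and wait for the response with the same id before issuing the next, which keeps the run deterministic.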
Mock-site requirements (prerequisite)
The agent depends on a mock web frontend that doesn't exist yet. v1 of this issue covers the agent itself plus a minimal mock; the mock can grow over time. Required surfaces:
- A static console.x.com simulation served by an in-process HTTP server (started alongside the existing mock.rs HTTP layer). Pages: project list, project create form, app dashboard, user-auth settings, keys-and-tokens. Just enough DOM for the agent to click through.
- A static x.com simulation with two routes: an OAuth /i/oauth2/authorize consent page (renders an Authorize button that redirects the browser to the registered callback with a synthetic code in the query string), and a /home route that renders pre-seeded For-You articles in the same DOM shape the scrape extension expects.
- DNS / hostname plumbing so Chromium resolves these synthetic origins. Easiest path: --host-resolver-rules="MAP console.x.com 127.0.0.1:NNNN, MAP x.com 127.0.0.1:NNNN" plus --ignore-certificate-errors for the test profile only.
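The hostname plumbing could be assembled along these lines. This is a sketch that assumes a single mock server address serves both synthetic origins. mock_origin_flags is a hypothetical helper; the two Chromium flags are the ones named in the list above.

```rust
use std::net::SocketAddr;

/// Builds the Chromium flags that point the synthetic origins at the local
/// mock server. Refuses any non-loopback address so a misconfigured harness
/// cannot map the hosts somewhere real.
fn mock_origin_flags(mock: SocketAddr, hosts: &[&str]) -> Result<Vec<String>, String> {
    if !mock.ip().is_loopback() {
        return Err(format!("mock server address {mock} is not loopback; refusing to map hosts"));
    }
    // One MAP rule per host; rules are comma-separated within a single flag value.
    let rules: Vec<String> = hosts
        .iter()
        .map(|host| format!("MAP {host} {mock}"))
        .collect();
    Ok(vec![
        format!("--host-resolver-rules={}", rules.join(", ")),
        "--ignore-certificate-errors".to_string(),
    ])
}
```

Each returned string is one argv element, so when passed through std::process::Command::args the spaces inside the rules value need no shell quoting.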
Opt-in surface (CLI-side)
A new subcommand: psychological-operations test-agent <flow> [args]. Available flows: x-app-setup, psyop-oauth <name>, for-you-scroll <name> --tweets <N>. Each flow blocks until verification completes, then exits 0 on success and non-zero on failure. Invocations are stitched together by the integration-test harness; no special integration with the run loop itself.
The subcommand is registered only when the test-agent cargo feature is enabled. Without the feature flag the subcommand simply isn't in psychological-operations --help.
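A sketch of how the three flows and the exit-code contract might be modelled, assuming hand-rolled argument matching for brevity. The CLI presumably has its own argument parser, and in the real tree this code would sit behind #[cfg(feature = "test-agent")]. Flow, parse_flow, and exit_code are hypothetical names.

```rust
/// The three flows named in the subcommand description.
#[derive(Debug, PartialEq)]
enum Flow {
    XAppSetup,
    PsyopOauth { name: String },
    ForYouScroll { name: String, tweets: usize },
}

/// Parses the arguments that follow `test-agent` on the command line.
fn parse_flow(args: &[&str]) -> Result<Flow, String> {
    match args {
        ["x-app-setup"] => Ok(Flow::XAppSetup),
        ["psyop-oauth", name] => Ok(Flow::PsyopOauth { name: name.to_string() }),
        ["for-you-scroll", name, "--tweets", n] => {
            let tweets = n
                .parse::<usize>()
                .map_err(|e| format!("invalid --tweets value {n:?}: {e}"))?;
            Ok(Flow::ForYouScroll { name: name.to_string(), tweets })
        }
        _ => Err("usage: test-agent x-app-setup | psyop-oauth <name> | for-you-scroll <name> --tweets <N>".to_string()),
    }
}

/// Maps a flow's verification outcome to the process exit code:
/// 0 when verification passed, 1 otherwise.
fn exit_code(outcome: &Result<(), String>) -> i32 {
    if outcome.is_ok() { 0 } else { 1 }
}
```

Keeping each flow as its own process invocation with a plain exit code is what lets the integration-test harness chain them without any hooks into the run loop.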
Acceptance criteria
- psychological-operations test-agent x-app-setup writes a valid x_app.json to a per-test data dir without operator intervention.
- psychological-operations test-agent psyop-oauth <name> writes a valid tokens/<name>.json with both access and refresh tokens.
- psychological-operations test-agent for-you-scroll <name> --tweets 50 ends with exactly 50 unique IDs in for_you_queue for the given psyop.
- […] psyops run --name <name> end-to-end against the mock and asserts the expected delivery_queue rows exist.
- […] test-agent feature contains no agent code (verified by nm | grep test_agent returning empty, or equivalent on Windows).
Files
- psychological-operations-cli/src/test_agent/ (new module, #[cfg(feature = "test-agent")]): the CDP driver, per-flow scripts, the URL-allowlist check.
- psychological-operations-cli/src/x/mock.rs: extend to also serve the mock console.x.com and x.com static pages alongside the API mock, gated on the same env var.
- psychological-operations-cli/Cargo.toml: add test-agent feature; pull in a minimal CDP client crate (chromiumoxide or similar). Feature is not in default.
- psychological-operations-cli/src/run.rs: register the test-agent subcommand under #[cfg(feature = "test-agent")].
- psychological-operations-cli/tests/test_agent_e2e.rs (new): the stitched end-to-end integration test.
- psychological-operations-chromium-extension-auth/manifest.json and psychological-operations-chromium-extension-scrape/manifest.json: extend host_permissions to cover the mock origins (loopback / *.test) only when a test build is loaded; production builds keep the current production-only host list.
- README.md: Testing section with the TOS warning and how to invoke the agent.
Out of scope
- LLM-driven UI adaptation. The agent uses fixed selectors against a mock the test harness controls; there's no need for a model in the loop.
- Driving real X. Don't add it. Anyone who wants automation against real X is welcome to fork; this repo doesn't ship that.
- Replacing the operator-driven flows for production users. The X App setup wizard, per-psyop OAuth, and human-driven For-You scrolling remain the production paths. The agent is purely test infrastructure.
- Mock-site fidelity beyond what the agent's selectors need. The mock pages are a stub, not a faithful reproduction of X's UI. They evolve only as the test agent's selectors evolve.