Problem
Outbound SMS, iMessage, and email sends can be queued successfully, then fail later via asynchronous delivery webhooks after the agent loop has already completed and gone idle. When the failed outbound message belongs to an active/known thread, the plugin should wake the agent and tell it exactly what failed so it can recover gracefully.
Current state in this repo
- text.delivery_failed, imessage.delivery_failed, message.bounced, and message.failed handlers exist and run if those webhooks arrive.
- The setup/subscription path is narrower and does not currently subscribe to those failure events.
- The wake-up path uses a consult/side-effect turn, so the agent's final text is not automatically sent back on the original channel/thread. The agent must manually use tools for visible recovery.
- Existing handling does not consistently preserve the original SMS/iMessage conversation id or email thread id in the recovery route.
Fleet standard
Implement the same behavior across Claude Code, Codex, Hermes, and OpenClaw plugins:
- Wake the agent only for hard failed outbound delivery events:
  - SMS: text.delivery_failed
  - iMessage: imessage.delivery_failed
  - Email: message.bounced, message.failed
- Do not wake on text.delivery_unconfirmed; that is telemetry/status uncertainty, not a hard failed-delivery recovery signal.
- Track outbound delivery context when a send queues successfully, keyed by provider/Inkbox message id where available. Store channel, contact/session key, recipient, original body snippet, SMS/iMessage conversation id, and email thread/subject metadata.
- When a failure webhook arrives, correlate it to the original thread using outbound context first, then webhook contact/thread/recipient fallback. If no usable thread/session can be resolved, log and do not wake.
- Wake the agent with a synthetic recovery turn whose final response is sent on the same channel/thread by default.
- The prompt must explain that the previous outbound message failed, include reason/error details and the failed message body when available, and tell the agent it may modify/shorten/retry, use tools to switch channel, or reply exactly [SILENT] to do nothing visible.
- Deduplicate repeated failure webhooks by channel + event type + message id, with a payload-hash fallback when no id is present.
- Add loop protection so failed recovery sends do not cause unbounded retry loops.
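
The wake filter, deduplication key, and loop protection described in the list above could fit together roughly as follows. This is a minimal sketch, not the plugins' actual code: `OutboundContext`, `FailureRouter`, `recovery_depth`, the `message_id` payload field, and the cap of 2 recovery attempts are all hypothetical names and values chosen for illustration. The webhook contact/thread/recipient fallback is left out for brevity.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Dict, Optional, Set

# Hard-failure events from the list above, mapped to their channel.
# text.delivery_unconfirmed is deliberately absent, so it never wakes the agent.
WAKE_EVENTS = {
    "text.delivery_failed": "sms",
    "imessage.delivery_failed": "imessage",
    "message.bounced": "email",
    "message.failed": "email",
}

MAX_RECOVERY_ATTEMPTS = 2  # hypothetical cap; the issue does not fix a number


@dataclass
class OutboundContext:
    """Hypothetical record stored when a send queues successfully."""
    channel: str
    session_key: str
    recipient: str
    body_snippet: str
    thread_id: Optional[str] = None  # SMS/iMessage conversation id or email thread id
    recovery_depth: int = 0  # 0 for normal sends, +1 per nested recovery send


def dedup_key(event_type: str, payload: dict) -> str:
    """Build channel + event type + message id, or a payload hash when no id exists."""
    channel = WAKE_EVENTS[event_type]
    message_id = payload.get("message_id")  # assumed field name
    if message_id:
        return f"{channel}:{event_type}:{message_id}"
    canonical = json.dumps(payload, sort_keys=True, default=str)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{channel}:{event_type}:hash:{digest}"


class FailureRouter:
    """Decides whether a failure webhook should wake the agent."""

    def __init__(self) -> None:
        self.outbound: Dict[str, OutboundContext] = {}
        self.seen: Set[str] = set()

    def track_send(self, message_id: str, ctx: OutboundContext) -> None:
        self.outbound[message_id] = ctx

    def should_wake(self, event_type: str, payload: dict) -> Optional[OutboundContext]:
        if event_type not in WAKE_EVENTS:
            return None  # not a hard failure (e.g. text.delivery_unconfirmed)
        key = dedup_key(event_type, payload)
        if key in self.seen:
            return None  # duplicate webhook; a recovery turn already ran
        self.seen.add(key)
        ctx = self.outbound.get(payload.get("message_id"))
        if ctx is None:
            return None  # no usable thread/session: caller logs and does not wake
        if ctx.recovery_depth >= MAX_RECOVERY_ATTEMPTS:
            return None  # loop protection: stop retrying failed recovery sends
        return ctx
```

In this sketch, a send made during a recovery turn would be tracked with `recovery_depth` one higher than the context that triggered it, so a chain of failing retries stops once the cap is reached.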
Acceptance criteria
- Subscriptions include the hard failure events listed above.
- Unit tests cover SMS, iMessage, and email failure webhooks.
- Tests prove a correlated failure wakes the right session/thread.
- Tests prove recovery output is delivered on the same channel/thread by default.
- Tests prove exact [SILENT] suppresses visible delivery.
- Tests prove text.delivery_unconfirmed does not wake the agent.
- Tests prove duplicate failure webhooks do not trigger duplicate recovery turns.
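
The same-thread delivery and [SILENT] criteria above might be exercised against a helper shaped like the one below. This is a sketch under stated assumptions: `deliver_recovery_reply` and the `send` callback signature are hypothetical, and ignoring surrounding whitespace when matching the sentinel is an assumption that the issue does not settle.

```python
from typing import Callable

SILENT_SENTINEL = "[SILENT]"


def deliver_recovery_reply(
    final_text: str,
    channel: str,
    thread_id: str,
    send: Callable[[str, str, str], None],
) -> bool:
    """Send the recovery turn's final text on the original channel/thread.

    Returns False without sending when the agent replied with exactly the
    sentinel (surrounding whitespace ignored, which is an assumption here).
    """
    if final_text.strip() == SILENT_SENTINEL:
        return False
    send(channel, thread_id, final_text)
    return True
```

A unit test can pass a fake `send` that records its arguments, then assert that a normal reply lands on the original channel and thread while an exact [SILENT] reply records nothing.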
Notes
This should be coordinated with the fleet standardization branch and tracked in admin-console/docs/PLUGIN_FLEET.md.