Standardize failed outbound delivery recovery #7

## Problem

Outbound SMS, iMessage, and email sends can be queued successfully, then fail later via asynchronous delivery webhooks after the agent loop has already completed and gone idle. When the failed outbound message belongs to an active/known thread, the plugin should wake the agent and tell it exactly what failed so it can recover gracefully.

## Current state in this repo

- Handlers for `text.delivery_failed`, `imessage.delivery_failed`, `message.bounced`, and `message.failed` exist and run if those webhooks arrive.
- The setup/subscription path registers a narrower set of events and does not currently subscribe to those failure events.
- The wake-up path uses a consult/side-effect turn, so the agent's final text is not automatically sent back on the original channel/thread. The agent must manually use tools for visible recovery.
- Existing handling does not consistently preserve the original SMS/iMessage conversation id or email thread id in the recovery route.
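
A minimal sketch of how the subscription gap could be checked during setup. The function name and the idea of passing in the currently subscribed event list are assumptions for illustration; the real setup path may expose this differently.

```python
# Hard failure events named in this issue.
REQUIRED_FAILURE_EVENTS = (
    "text.delivery_failed",
    "imessage.delivery_failed",
    "message.bounced",
    "message.failed",
)


def missing_failure_subscriptions(subscribed):
    """Return the required failure events absent from `subscribed`, in a stable order.

    `subscribed` is assumed to be an iterable of event-type strings
    taken from whatever the setup path currently registers.
    """
    current = set(subscribed)
    return [event for event in REQUIRED_FAILURE_EVENTS if event not in current]
```

An empty result would mean the subscription already covers every hard failure event.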

## Fleet standard

Implement the same behavior across Claude Code, Codex, Hermes, and OpenClaw plugins:

1. Wake the agent only for hard failed outbound delivery events:
   - SMS: `text.delivery_failed`
   - iMessage: `imessage.delivery_failed`
   - Email: `message.bounced`, `message.failed`
2. Do not wake on `text.delivery_unconfirmed`; that is telemetry/status uncertainty, not a hard failed-delivery recovery signal.
3. Track outbound delivery context when a send queues successfully, keyed by provider/Inkbox message id where available. Store channel, contact/session key, recipient, original body snippet, SMS/iMessage conversation id, and email thread/subject metadata.
4. When a failure webhook arrives, correlate it to the original thread using outbound context first, then webhook contact/thread/recipient fallback. If no usable thread/session can be resolved, log and do not wake.
5. Wake the agent with a synthetic recovery turn whose final response is sent on the same channel/thread by default.
6. The prompt must explain that the previous outbound message failed, include reason/error details and the failed message body when available, and tell the agent it may modify/shorten/retry, use tools to switch channel, or reply exactly `[SILENT]` to do nothing visible.
7. Deduplicate repeated failure webhooks by channel + event type + message id, with a payload-hash fallback when no id is present.
8. Add loop protection so failed recovery sends do not cause unbounded retry loops.
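
The steps above could be sketched roughly as follows. This is an illustration under stated assumptions, not the plugin's actual code: the payload field names (`message_id`, `contact`, `recipient`, `thread_id`, `reason`), the `RecoveryRouter` and `OutboundContext` names, and the retry cap of 2 are all hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import Optional

# Hard failure events from this issue, mapped to their channel.
HARD_FAILURE_EVENTS = {
    "text.delivery_failed": "sms",
    "imessage.delivery_failed": "imessage",
    "message.bounced": "email",
    "message.failed": "email",
}
MAX_RECOVERY_DEPTH = 2  # assumed cap; the issue does not specify a number


@dataclass
class OutboundContext:
    channel: str
    session_key: str
    recipient: str
    body_snippet: str
    thread_id: Optional[str] = None  # conversation id or email thread id
    recovery_depth: int = 0  # recovery turns that led to this send


def dedupe_key(event_type: str, payload: dict) -> str:
    """Channel + event type + message id, or a payload hash when no id exists."""
    channel = HARD_FAILURE_EVENTS[event_type]
    message_id = payload.get("message_id")  # assumed field name
    if message_id:
        return f"{channel}:{event_type}:{message_id}"
    raw = json.dumps(payload, sort_keys=True).encode("utf-8")
    return f"{channel}:{event_type}:hash:{hashlib.sha256(raw).hexdigest()}"


class RecoveryRouter:
    def __init__(self) -> None:
        self.outbound = {}  # message id -> OutboundContext
        self.seen = set()  # dedupe keys already handled

    def track_send(self, message_id: str, ctx: OutboundContext) -> None:
        """Record context when a send queues successfully."""
        self.outbound[message_id] = ctx

    def handle(self, event_type: str, payload: dict) -> Optional[dict]:
        """Return a wake request, or None when the agent should not be woken."""
        channel = HARD_FAILURE_EVENTS.get(event_type)
        if channel is None:  # e.g. text.delivery_unconfirmed
            return None
        key = dedupe_key(event_type, payload)
        if key in self.seen:  # repeated webhook
            return None
        self.seen.add(key)
        ctx = self.outbound.get(payload.get("message_id"))
        if ctx is None:  # fall back to webhook contact/recipient/thread
            session_key = payload.get("contact") or payload.get("recipient")
            if not session_key:
                return None  # nothing usable: caller logs and does not wake
            ctx = OutboundContext(channel, session_key,
                                  payload.get("recipient", ""), "",
                                  payload.get("thread_id"))
        if ctx.recovery_depth >= MAX_RECOVERY_DEPTH:  # loop protection
            return None
        return {
            "channel": ctx.channel,
            "session_key": ctx.session_key,
            "thread_id": ctx.thread_id,
            "reason": payload.get("reason"),
            "failed_body": ctx.body_snippet,
            "recovery_depth": ctx.recovery_depth + 1,
        }
```

In this sketch, a send made during a recovery turn would be tracked with the returned `recovery_depth`, so a chain of failed recovery sends stops once the cap is reached.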

## Acceptance criteria

- Subscriptions include the hard failure events listed above.
- Unit tests cover SMS, iMessage, and email failure webhooks.
- Tests prove a correlated failure wakes the right session/thread.
- Tests prove recovery output is delivered on the same channel/thread by default.
- Tests prove exact `[SILENT]` suppresses visible delivery.
- Tests prove `text.delivery_unconfirmed` does not wake the agent.
- Tests prove duplicate failure webhooks do not trigger duplicate recovery turns.
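
For the prompt and `[SILENT]` criteria, a test could exercise helpers along these lines. Both function names and the prompt wording are hypothetical, and ignoring surrounding whitespace around the sentinel is an assumption the issue does not state.

```python
from typing import Optional

SILENT_SENTINEL = "[SILENT]"


def build_recovery_prompt(channel: str, reason: Optional[str],
                          failed_body: Optional[str]) -> str:
    """Compose the synthetic recovery turn text (wording is illustrative only)."""
    lines = [f"Your previous outbound {channel} message failed to deliver."]
    if reason:
        lines.append(f"Reason: {reason}")
    if failed_body:
        lines.append(f"Failed message body: {failed_body}")
    lines.append(
        "You may modify, shorten, or retry the message, use tools to switch "
        f"channel, or reply exactly {SILENT_SENTINEL} to do nothing visible."
    )
    return "\n".join(lines)


def visible_reply(final_text: str) -> Optional[str]:
    """Return the text to send on the original thread, or None to suppress it.

    Only a reply that is exactly the sentinel suppresses delivery;
    surrounding whitespace is ignored here as an assumption.
    """
    if final_text.strip() == SILENT_SENTINEL:
        return None
    return final_text
```

A reply that merely contains the sentinel inside other text is still delivered in this sketch, which matches the "exact" wording of the criterion.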

## Notes

This work should be coordinated with the fleet standardization branch and tracked in `admin-console/docs/PLUGIN_FLEET.md`.

