diff --git a/README.md b/README.md
index 5e8862c..6ba10f2 100644
--- a/README.md
+++ b/README.md
@@ -61,6 +61,8 @@ async def main() -> None:
         user = await client.get("/users/1", response_model=User)
 ```
 
+Need a custom middleware (auth, tracing, request-ID propagation, etc.)? See the [Middleware guide](docs/middleware.md).
+
 ### Streaming responses
 
 For large responses or server-sent events, stream the body chunk-by-chunk. `stream()` is an async context manager:
diff --git a/docs/errors.md b/docs/errors.md
new file mode 100644
index 0000000..1ab6836
--- /dev/null
+++ b/docs/errors.md
@@ -0,0 +1,133 @@
+# Errors reference
+
+`httpware` raises typed exceptions automatically: every exception inherits from `ClientError`, and any HTTP response with a 4xx/5xx status raises a status-keyed `StatusError` subclass, so you never have to call `response.raise_for_status()`.
+
+For the resilience-specific errors (`RetryBudgetExhaustedError`, `BulkheadFullError`) see the [Resilience reference](resilience.md).
+
+## The exception tree
+
+```
+ClientError                          (catch-all for anything httpware raises)
+├── TransportError                   (connection/network/protocol failure pre-response)
+│   └── NetworkError                 (transient — safe to retry; covered by Retry's defaults)
+├── TimeoutError                     (also inherits builtins.TimeoutError — except OSError catches it)
+├── StatusError                      (got a response but its status was 4xx/5xx)
+│   ├── ClientStatusError            (any 4xx — fallback for unknown 4xx codes)
+│   │   ├── BadRequestError          (400)
+│   │   ├── UnauthorizedError        (401)
+│   │   ├── ForbiddenError           (403)
+│   │   ├── NotFoundError            (404)
+│   │   ├── ConflictError            (409)
+│   │   ├── UnprocessableEntityError (422)
+│   │   └── RateLimitedError         (429)
+│   └── ServerStatusError            (any 5xx — fallback for unknown 5xx codes)
+│       ├── InternalServerError      (500)
+│       └── ServiceUnavailableError  (503)
+├── RetryBudgetExhaustedError        (a retry was needed but the budget refused)
+└── BulkheadFullError                (acquire_timeout elapsed before a slot opened)
+```
+
+## Status-to-exception mapping
+
+| Status | Exception class |
+|---|---|
+| 400 | `BadRequestError` |
+| 401 | `UnauthorizedError` |
+| 403 | `ForbiddenError` |
+| 404 | `NotFoundError` |
+| 409 | `ConflictError` |
+| 422 | `UnprocessableEntityError` |
+| 429 | `RateLimitedError` |
+| 500 | `InternalServerError` |
+| 503 | `ServiceUnavailableError` |
+| other 4xx | `ClientStatusError` (fallback) |
+| other 5xx | `ServerStatusError` (fallback) |
+
+The fallback assumes `400 ≤ status < 600`. Statuses outside that range don't raise (they return the response as-is).
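+
+A minimal sketch of how a lookup like this could work, assuming a plain dict keyed by status code plus range checks for the fallbacks. The stub classes and the `exception_class_for` helper are hypothetical stand-ins that only mirror names from the table; they are not `httpware`'s internals.
+
+```python
+class StatusError(Exception):
+    """Local stand-in for httpware.StatusError."""
+
+
+class ClientStatusError(StatusError):
+    """Fallback for any 4xx."""
+
+
+class ServerStatusError(StatusError):
+    """Fallback for any 5xx."""
+
+
+class NotFoundError(ClientStatusError):
+    """404."""
+
+
+class ServiceUnavailableError(ServerStatusError):
+    """503."""
+
+
+# Hypothetical subset of the table above; a full version would list every row.
+_BY_STATUS: dict[int, type[StatusError]] = {
+    404: NotFoundError,
+    503: ServiceUnavailableError,
+}
+
+
+def exception_class_for(status: int) -> type[StatusError] | None:
+    """Return the class to raise for `status`, or None when nothing is raised."""
+    if status in _BY_STATUS:
+        return _BY_STATUS[status]
+    if 400 <= status < 500:
+        return ClientStatusError
+    if 500 <= status < 600:
+        return ServerStatusError
+    return None  # outside 400..599: the response is returned as-is
+```
+
+Keeping the specific codes in a dict and the fallbacks as range checks means an unlisted code such as 418 still lands on a catchable `ClientStatusError`.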
+
+## Catching strategies
+
+```python
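+import logging
+
+from httpware import (
+    AsyncClient,
+    ClientError,
+    StatusError,
+    NetworkError,
+    TimeoutError,
+    NotFoundError,
+    RetryBudgetExhaustedError,
+    BulkheadFullError,
+)
+
+# Application-side logger; the name is illustrative.
+_LOGGER = logging.getLogger("myapp.users")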
+
+
+async def fetch(client: AsyncClient, user_id: int) -> dict | None:
+    try:
+        return await client.get(f"/users/{user_id}", response_model=dict)
+    except NotFoundError:
+        # Specific status — most precise. Convert to None as the "absent" sentinel.
+        return None
+    except StatusError as exc:
+        # Got a response, but its status was 4xx/5xx and not one we handle specifically.
+        # exc.response.* is available — headers, content, request, etc.
+        _LOGGER.warning("upstream returned %s for %s", exc.response.status_code, exc.response.request.url)
+        raise
+    except NetworkError:
+        # Transient transport failure. Already retried by the default Retry middleware
+        # (if installed) when the method was idempotent. Seeing this means retries
+        # exhausted or the method was non-idempotent.
+        raise
+    except (RetryBudgetExhaustedError, BulkheadFullError) as exc:
+        # Resilience refusal — backpressure signal. Back off the caller.
+        _LOGGER.error("resilience refused: %s", exc)
+        raise
+    except ClientError:
+        # Catch-all for anything else httpware raised.
+        raise
+```
+
+`TimeoutError` inherits from both `ClientError` and `builtins.TimeoutError`, so `except builtins.TimeoutError` and `except OSError` both catch it (this matches what `asyncio.wait_for` raises). Stdlib-style timeout handling therefore works without changes.
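+
+The double inheritance can be illustrated with stand-in classes. This is a sketch, not `httpware`'s source: `ClientError` and `TimeoutError` below are local stubs, and `caught_by_stdlib_handler` is a hypothetical helper.
+
+```python
+import builtins
+
+
+class ClientError(Exception):
+    """Local stand-in for httpware.ClientError."""
+
+
+class TimeoutError(ClientError, builtins.TimeoutError):  # noqa: A001
+    """Local stand-in; note that it shadows the builtin name in this module."""
+
+
+def caught_by_stdlib_handler(exc: Exception) -> bool:
+    """Report whether a stdlib-style `except OSError` clause catches `exc`."""
+    try:
+        raise exc
+    except OSError:
+        return True
+    except Exception:
+        return False
+
+
+timeout = TimeoutError("read timed out")
+```
+
+Because `builtins.TimeoutError` is itself an `OSError` subclass, one class satisfies `except ClientError`, `except builtins.TimeoutError`, and `except OSError` at the same time.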
+
+## `exc.response.*` access pattern
+
+For any `StatusError` subclass, the raw `httpx2.Response` is on `exc.response`:
+
+```python
+exc.response.status_code     # 404
+exc.response.headers          # httpx2.Headers — case-insensitive
+exc.response.content          # raw bytes
+exc.response.text             # decoded body
+exc.response.json()           # parsed JSON (raises if not JSON)
+exc.response.request          # the failing httpx2.Request
+exc.response.request.url      # the failing URL (httpx2.URL)
+exc.response.request.method   # the HTTP method
+```
+
+**Security note:** `__repr__` and the exception's summary message strip `user:pass@` userinfo from the URL to avoid leaking credentials in tracebacks. **Query-string secrets are NOT stripped** — keep secrets out of query strings.
+
+## Resilience-error payloads
+
+`RetryBudgetExhaustedError` carries:
+- `last_response: httpx2.Response | None` — the last response observed before the budget refused (None if all failures were transport-level)
+- `last_exception: BaseException | None` — the last exception observed before the budget refused
+- `attempts: int` — number of attempts already completed
+
+`BulkheadFullError` carries:
+- `max_concurrent: int` — the configured cap
+- `acquire_timeout: float | None` — the configured timeout
+
+Use these for caller-side logging / alerting:
+
+```python
+except RetryBudgetExhaustedError as exc:
+    _LOGGER.error(
+        "budget exhausted after %d attempts; last_status=%s",
+        exc.attempts,
+        exc.last_response.status_code if exc.last_response is not None else None,
+    )
+```
+
+## See also
+
+- **[Resilience reference](resilience.md)** — `Retry`, `RetryBudget`, `Bulkhead` parameter tables.
+- **[Middleware guide](middleware.md)** — the `@on_error` decorator can translate exceptions into responses.
+- **`planning/engineering.md` §4** — the formal exception contract.
diff --git a/docs/index.md b/docs/index.md
index 0e50f29..e10a5c9 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -106,6 +106,10 @@ When installed, `_emit_event` calls `trace.get_current_span().add_event(name, at
 
 ## Where to go next
 
+- **[Resilience reference](resilience.md)** — every parameter on `Retry`, `RetryBudget`, and `Bulkhead`; the retry-rule matrix; Retry-After parsing; budget sharing.
+- **[Middleware guide](middleware.md)** — write your own middleware. Covers the Middleware Protocol, the phase decorators, a worked Request-ID propagation example, and OpenTelemetry wiring.
+- **[Errors reference](errors.md)** — the full exception tree, catching strategies, `exc.response.*` access pattern.
+- **[Testing guide](testing.md)** — mock-transport injection pattern for testing code that uses `httpware`.
 - **[Engineering Notes](https://github.com/modern-python/httpware/blob/main/planning/engineering.md)** — design invariants, the three protocol seams, exception contract, module layout, testing patterns, optional-extras pattern. Lives in the repo at `planning/engineering.md`.
 - **[Contributing](dev/contributing.md)** — setup, conventions, workflow.
 - **[Release notes](https://github.com/modern-python/httpware/releases)** — per-version changelogs.
diff --git a/docs/middleware.md b/docs/middleware.md
new file mode 100644
index 0000000..94db1d0
--- /dev/null
+++ b/docs/middleware.md
@@ -0,0 +1,159 @@
+# Writing custom middleware
+
+`httpware`'s primary extension point is the **Middleware protocol**. Middleware lets you add cross-cutting behavior — request-ID propagation, auth header injection, structured tracing, custom resilience policies, anything that wraps "send a request, get a response" — without subclassing `AsyncClient` or touching the transport.
+
+The built-in `Retry` and `Bulkhead` middleware are themselves implementations of this protocol; nothing about them is privileged. If you want a circuit breaker, a rate limiter, or a header-injecting auth layer, write a middleware. If your need is per-call (not cross-cutting), pass it through `request.extensions=` instead.
+
+## The protocol
+
+Two symbols, both exported from `httpware.middleware`:
+
+```python
+from collections.abc import Awaitable, Callable
+from typing import Protocol, TypeAlias, runtime_checkable
+import httpx2
+
+Next: TypeAlias = Callable[[httpx2.Request], Awaitable[httpx2.Response]]
+
+
+@runtime_checkable
+class Middleware(Protocol):
+    async def __call__(self, request: httpx2.Request, next: Next) -> httpx2.Response: ...
+```
+
+The chain is composed once at `AsyncClient.__init__` and frozen for the client's lifetime. The first entry in `middleware=[...]` is the outermost layer: when you write `middleware=[Bulkhead(...), Retry()]`, the bulkhead sees every request before the retry layer does, so one slot covers all retry attempts of the same call.
+
+Calling `await next(request)` forwards to the next layer (or, eventually, to the terminal that hits `httpx2`). You can:
+
+- **Forward unchanged:** `return await next(request)`
+- **Modify the request first:** mutate `request.headers` (or build a replacement) before forwarding
+- **Inspect or replace the response:** call `await next(...)`, then act on what comes back
+- **Short-circuit:** return a synthesized `httpx2.Response` without calling `next` at all
+- **Wrap the call in error handling:** `try: return await next(...) except ...` to translate failures
+
+Whatever you do, return an `httpx2.Response`. Raising an exception propagates up the chain (Retry catches retryable exceptions; everything else surfaces to the caller).
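+
+The "first entry is outermost" ordering can be sketched without `httpware` at all. The example below assumes a simple right-to-left fold and uses plain strings in place of `httpx2.Request` / `httpx2.Response`; `compose`, `tag`, and `terminal` are hypothetical names, and the library's real composition code may differ.
+
+```python
+import asyncio
+from collections.abc import Awaitable, Callable
+
+Handler = Callable[[str], Awaitable[str]]
+Layer = Callable[[str, Handler], Awaitable[str]]
+
+
+def _bind(layer: Layer, nxt: Handler) -> Handler:
+    """Close over `nxt` so the layer can be called like a plain handler."""
+
+    async def call(request: str) -> str:
+        return await layer(request, nxt)
+
+    return call
+
+
+def compose(layers: list[Layer], terminal: Handler) -> Handler:
+    """Fold layers right-to-left so layers[0] ends up outermost."""
+    handler = terminal
+    for layer in reversed(layers):
+        handler = _bind(layer, handler)
+    return handler
+
+
+def tag(name: str) -> Layer:
+    """Build a layer that marks the request on the way in and the response on the way out."""
+
+    async def layer(request: str, nxt: Handler) -> str:
+        response = await nxt(f"{request}>{name}")
+        return f"{response}<{name}"
+
+    return layer
+
+
+async def terminal(request: str) -> str:
+    """Stand-in for the layer that would actually send the request."""
+    return f"[{request}]"
+
+
+chain = compose([tag("outer"), tag("inner")], terminal)
+result = asyncio.run(chain("req"))
+```
+
+Here `result` is `"[req>outer>inner]<inner<outer"`: the first layer sees the request first and the response last, which is why a bulkhead listed before a retry layer holds one slot across all attempts.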
+
+## Phase decorators
+
+For the common cases where you don't need state-keeping on `self` and don't need to wrap the full `await next(...)` call, `httpware.middleware` exports three decorators that turn a single async function into a `Middleware`:
+
+```python
+from httpware.middleware import before_request, after_response, on_error
+```
+
+| Decorator | Function signature | When to use |
+|---|---|---|
+| `@before_request` | `async (request) -> request` | Transform the outgoing request (add a header, rewrite a URL). |
+| `@after_response` | `async (request, response) -> response` | Transform the incoming response (decode, log, attach metadata). |
+| `@on_error` | `async (request, exc) -> response \| None` | Translate or absorb a failure. Return `None` to re-raise. Catches `Exception` (not `BaseException`), so `asyncio.CancelledError` propagates. |
+
+Brief example — adding an `Authorization` header before every request:
+
+```python
+import httpx2
+
+from httpware import AsyncClient
+from httpware.middleware import before_request
+
+
+@before_request
+async def add_bearer(request: httpx2.Request) -> httpx2.Request:
+    request.headers["Authorization"] = "Bearer secret-token"
+    return request
+
+
+async def main() -> None:
+    async with AsyncClient(base_url="https://api.example.com", middleware=[add_bearer]) as client:
+        await client.get("/me")
+```
+
+**Reach for the raw `Middleware` protocol when:** you need instance state (a counter, a CircuitBreaker's open/closed flag), you need to inspect both the request AND its response (e.g., timing), or you need to interleave behavior around the `await next(...)` call (e.g., emit one log line at the start and one at the end). The decorators are a convenience for the cases where a single function suffices.
+
+## Worked example: request-ID propagation
+
+A `RequestIdMiddleware` that assigns a per-call UUID, injects it as an outgoing header, and logs it alongside the response status. This is the canonical "trace every request through your distributed system" pattern.
+
+```python
+import logging
+import uuid
+
+import httpx2
+
+from httpware import AsyncClient, Retry
+from httpware.middleware import Next
+
+
+_LOGGER = logging.getLogger("myapp.request_id")
+
+
+class RequestIdMiddleware:
+    """Assign a per-call X-Request-Id; log it on response.
+
+    Place OUTSIDE Retry so all attempts of the same call share one ID
+    (so a single call's retries all surface under the same correlation
+    key in your logs, and match the URL attribute on httpware.retry's
+    emitted events).
+    """
+
+    def __init__(self, *, header: str = "X-Request-Id") -> None:
+        self._header = header
+
+    async def __call__(self, request: httpx2.Request, next: Next) -> httpx2.Response:  # noqa: A002
+        request_id = str(uuid.uuid4())
+        request.headers[self._header] = request_id
+        response = await next(request)
+        _LOGGER.info(
+            "request complete",
+            extra={"request_id": request_id, "status": response.status_code},
+        )
+        return response
+
+
+async def main() -> None:
+    async with AsyncClient(
+        base_url="https://api.example.com",
+        middleware=[RequestIdMiddleware(), Retry()],  # ID outside Retry
+    ) as client:
+        await client.get("/users/1")
+```
+
+A note on logger names: the example logs under `myapp.request_id`, NOT under `httpware.*`. The `httpware.*` namespace is reserved for events emitted by the library itself (see [Observability](index.md#observability) — `httpware.retry` and `httpware.bulkhead` are stable contracts). Consumer middleware should use your application's own logger namespace.
+
+The example pairs naturally with the 0.6.0 observability events: a `httpware.retry` `retry.giving_up` log record carries a `url` attribute, and your `RequestIdMiddleware` sets an `X-Request-Id` for that same call. Correlate the two in your log aggregator and you have end-to-end visibility from "this user's request" to "we gave up after N retries."
+
+## When NOT to write a middleware
+
+- **Redaction:** Use a `logging.Filter` on the consumer side. `httpware` deliberately does no redaction in-library (per the 0.6.0 observability design).
+- **URL or header validation:** `httpx2` owns it. Don't reimplement.
+- **Per-call behavior that doesn't apply to other calls:** Pass through `request.extensions=` (or the `extensions=` kwarg at the call site) instead. Middleware exists for *cross-cutting* concerns.
+- **HTTP-level span creation for tracing:** Install `opentelemetry-instrumentation-httpx` instead of writing an OTel middleware in httpware. We retired story `5-4` (standalone OTel middleware) for this reason — `opentelemetry-instrumentation-httpx` already covers transport-level tracing, and a separate httpware layer would duplicate it. See `planning/engineering.md` §8.
+
+## Wiring OpenTelemetry
+
+`httpware[otel]` only ships `opentelemetry-api`. To make the observability events emitted by `Retry` and `Bulkhead` visible, you also need:
+
+- An **SDK** (`opentelemetry-sdk`) to actually collect spans
+- An **HTTP instrumentor** (`opentelemetry-instrumentation-httpx`) so each HTTP call creates a span — `httpware`'s events attach to that span via `trace.get_current_span().add_event(...)`
+
+Minimal setup (console exporter for development):
+
+```python
+from opentelemetry import trace
+from opentelemetry.sdk.trace import TracerProvider
+from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
+from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor
+
+trace.set_tracer_provider(TracerProvider())
+trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
+HTTPXClientInstrumentor().instrument()
+```
+
+After this runs, every `httpware` HTTP call gets an `HTTP <method>` span from the instrumentor, and Retry/Bulkhead observability events appear as span events on it (no extra configuration needed in `httpware` itself — the events fire whenever an active span is present).
+
+For production, swap `ConsoleSpanExporter` for your OTLP/Jaeger/Zipkin exporter. See the [OpenTelemetry Python docs](https://opentelemetry.io/docs/languages/python/) for the full SDK setup.
+
+## See also
+
+- **`planning/engineering.md` §3 (Seam A)** — the formal protocol contract and why the chain is frozen at construction.
+- **`src/httpware/middleware/resilience/`** — `Retry`, `Bulkhead`, `RetryBudget` as real-world consumers of this exact protocol.
+- **[Quick-Start composition example](index.md#with-resilience-middleware)** — composing built-in middleware.
diff --git a/docs/resilience.md b/docs/resilience.md
new file mode 100644
index 0000000..7c1074e
--- /dev/null
+++ b/docs/resilience.md
@@ -0,0 +1,173 @@
+# Resilience reference
+
+`httpware` ships three resilience primitives under `httpware.middleware.resilience`, all composable through the standard [Middleware](middleware.md) chain:
+
+- **`Retry`** — automatic retry of transient failures with full-jitter exponential backoff
+- **`RetryBudget`** — Finagle-style token bucket that bounds the global retry rate to prevent retry storms when downstreams degrade
+- **`Bulkhead`** — concurrency limiter via `asyncio.Semaphore` with bounded acquire-wait
+
+The canonical composition is `middleware=[Bulkhead(...), Retry()]` — `Bulkhead` outside `Retry` so one slot covers all retry attempts of a single call. Reach for the [Middleware guide](middleware.md) when you want to write your own resilience policy.
+
+## `Retry`
+
+```python
+from httpware.middleware.resilience import Retry
+```
+
+| Parameter | Default | Effect |
+|---|---|---|
+| `max_attempts` | `3` | Total tries (including the first). `1` disables retries entirely; `<1` raises `ValueError`. |
+| `base_delay` | `0.1` (s) | Floor for the full-jitter exponential backoff. |
+| `max_delay` | `5.0` (s) | Ceiling for backoff. |
+| `attempt_timeout` | `None` | If set, each individual attempt is wrapped in `asyncio.timeout(attempt_timeout)`. |
+| `retry_status_codes` | `frozenset({408, 429, 502, 503, 504})` | Status codes considered retryable. |
+| `retry_methods` | `frozenset({"GET", "HEAD", "OPTIONS", "PUT", "DELETE"})` | Idempotent methods only by default. POST excluded; pass an explicit frozenset including `"POST"` to retry it. |
+| `respect_retry_after` | `True` | When the response carries a `Retry-After` header on a retryable status, sleep for the header value (clamped to `max_delay`) instead of the jittered backoff. |
+| `budget` | `RetryBudget()` (default-configured) | The token bucket. Pass a shared `RetryBudget` instance to apply one budget across multiple clients. |
+
+### Retry-After parsing
+
+`Retry-After` is parsed as either:
+- **Integer seconds** — `Retry-After: 30` → sleep 30s (clamped to `max_delay`)
+- **HTTP-date** (IMF-fixdate, RFC 9110 §5.6.7) — `Retry-After: Wed, 21 Oct 2026 07:28:00 GMT` → sleep until that absolute time (clamped to `max_delay`, floored at 0)
+
+Negative integer values are clamped to 0. Malformed values are ignored, falling back to the jittered backoff.
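+
+The rules above can be sketched with the standard library alone. This is an illustration under assumptions, not `Retry`'s actual parser: `parse_retry_after` and its explicit `now` argument are hypothetical, and the date branch relies on `email.utils.parsedate_to_datetime`.
+
+```python
+from datetime import datetime, timezone
+from email.utils import parsedate_to_datetime
+
+
+def parse_retry_after(value: str, *, max_delay: float, now: datetime) -> float | None:
+    """Return the seconds to sleep, or None when the header is malformed."""
+    text = value.strip()
+    try:
+        # Integer-seconds form, e.g. "30".
+        seconds = float(int(text))
+    except ValueError:
+        try:
+            # HTTP-date form, e.g. "Wed, 21 Oct 2026 07:28:00 GMT".
+            when = parsedate_to_datetime(text)
+        except (TypeError, ValueError):
+            return None  # malformed: the caller falls back to jittered backoff
+        if when.tzinfo is None:
+            when = when.replace(tzinfo=timezone.utc)
+        seconds = (when - now).total_seconds()
+    # Floor at 0 (negative or past values), then clamp to max_delay.
+    return min(max(seconds, 0.0), max_delay)
+```
+
+Returning `None` for malformed input keeps the "ignore and fall back" decision with the caller, so the parser never has to know about backoff.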
+
+### Streaming-body refusal
+
+If the request body was an async-iterable, `Retry` refuses to retry — the iterator is consumed after the first attempt and can't replay. The original exception is re-raised with a PEP 678 note:
+
+```
+httpware: not retrying — request body is a stream that cannot replay across attempts
+```
+
+The same refusal note is added at the non-idempotent early-exit sites (when streaming combines with a non-idempotent method). The observability event `httpware.retry` `retry.streaming_refused` fires only at the retryable-failure-path site — see [Observability](index.md#observability).
+
+### Exhaustion behavior
+
+On exhaustion, `Retry` re-raises the *last* exception observed (e.g., `ServiceUnavailableError`, `NetworkError`), preserving the original class so `except ServiceUnavailableError` still catches it. A PEP 678 note is added: `httpware: gave up after N attempts`.
+
+If exhaustion is caused by the budget refusing a retry (not by `max_attempts`), the raised exception is `RetryBudgetExhaustedError` instead, with `last_response` / `last_exception` / `attempts` fields populated. See the [Errors reference](errors.md).
+
+## `RetryBudget`
+
+```python
+from httpware.middleware.resilience import RetryBudget
+```
+
+A Finagle-style token bucket bounding retry rate. Each request deposits a token; each retry attempts to withdraw one. Available retries are bounded by `percent_can_retry` of recent deposits, plus a `min_retries_per_sec * ttl` floor.
+
+| Parameter | Default | Effect |
+|---|---|---|
+| `ttl` | `10.0` (s) | Sliding window over which deposits and withdrawals count. |
+| `min_retries_per_sec` | `10.0` | Absolute floor — at least this many retries/sec are permitted regardless of deposit rate. |
+| `percent_can_retry` | `0.2` | Fraction of recent deposits that can convert to retries (above the floor). |
+
+### The token-bucket formula
+
+```
+ceiling = int(len(deposits_in_window) * percent_can_retry) + int(min_retries_per_sec * ttl)
+```
+
+A withdrawal fails when `len(withdrawn_in_window) >= ceiling`.
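+
+A worked sketch of the formula, assuming timestamp deques and an explicit `now` argument so the arithmetic is deterministic. `BudgetSketch` is a hypothetical illustration, not the `RetryBudget` implementation, and the exact expiry boundary (`<=` versus `<`) is an assumption.
+
+```python
+from collections import deque
+
+
+class BudgetSketch:
+    """Sliding-window budget; `now` is passed in explicitly to keep the sketch deterministic."""
+
+    def __init__(
+        self,
+        *,
+        ttl: float = 10.0,
+        min_retries_per_sec: float = 10.0,
+        percent_can_retry: float = 0.2,
+    ) -> None:
+        self.ttl = ttl
+        self.min_retries_per_sec = min_retries_per_sec
+        self.percent_can_retry = percent_can_retry
+        self._deposits: deque[float] = deque()
+        self._withdrawals: deque[float] = deque()
+
+    def _expire(self, now: float) -> None:
+        # Drop timestamps that have aged out of the ttl window.
+        cutoff = now - self.ttl
+        for window in (self._deposits, self._withdrawals):
+            while window and window[0] <= cutoff:
+                window.popleft()
+
+    def deposit(self, now: float) -> None:
+        self._expire(now)
+        self._deposits.append(now)
+
+    def ceiling(self, now: float) -> int:
+        self._expire(now)
+        percent_term = int(len(self._deposits) * self.percent_can_retry)
+        floor_term = int(self.min_retries_per_sec * self.ttl)
+        return percent_term + floor_term
+
+    def try_withdraw(self, now: float) -> bool:
+        ceiling = self.ceiling(now)  # expires old entries before counting
+        if len(self._withdrawals) >= ceiling:
+            return False
+        self._withdrawals.append(now)
+        return True
+
+
+budget = BudgetSketch(ttl=10.0, min_retries_per_sec=0.5, percent_can_retry=0.2)
+for _ in range(10):
+    budget.deposit(now=0.0)
+```
+
+With 10 deposits in the window, the percent term is `int(10 * 0.2) = 2` and the floor term is `int(0.5 * 10.0) = 5`, so seven withdrawals succeed and the eighth is refused. With the documented defaults the floor term alone is `int(10.0 * 10.0) = 100`.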
+
+### Why a floor matters
+
+If the deposit rate is zero (no traffic yet), the percent term is zero, so without the floor the very first retry would be refused. The floor lets low-traffic clients still retry on isolated failures; for high-traffic clients the percent term dominates and the floor becomes irrelevant.
+
+### Sharing across clients
+
+Pass the same `RetryBudget` instance to multiple `AsyncClient`s when they hit the same downstream — one joint budget covers them all:
+
+```python
+import asyncio
+
+from httpware import AsyncClient
+from httpware.middleware.resilience import Retry, RetryBudget
+
+
+shared = RetryBudget()
+
+
+async def main() -> None:
+    async with (
+        AsyncClient(base_url="https://api.example.com", middleware=[Retry(budget=shared)]) as users,
+        AsyncClient(base_url="https://api.example.com", middleware=[Retry(budget=shared)]) as orders,
+    ):
+        await asyncio.gather(users.get("/users/1"), orders.get("/orders/1"))
+```
+
+### Single-thread assumption
+
+`RetryBudget` is asyncio-aware — deque mutations between await points are atomic on a single event loop. Cross-thread use is out of scope; if you need that, wrap calls in a lock yourself.
+
+## `Bulkhead`
+
+```python
+from httpware.middleware.resilience import Bulkhead
+```
+
+Concurrency limiter via `asyncio.Semaphore`. Acquires a slot before each request (bounded by `acquire_timeout`); releases on success, exception, AND cancellation.
+
+| Parameter | Default | Effect |
+|---|---|---|
+| `max_concurrent` | **REQUIRED** | Maximum in-flight requests. `<1` raises `ValueError`. No default — the right cap depends on downstream capacity and SLA. |
+| `acquire_timeout` | `1.0` (s) | How long to wait for a slot before raising `BulkheadFullError`. `None` waits forever; `0` fails fast. `<0` raises `ValueError`. |
+
+### Slot release contract
+
+The slot is released in a `try/finally` around `await next(request)`, so all three exit paths release deterministically:
+- **Success** — slot released after the response returns
+- **Exception** — slot released before the exception propagates
+- **Cancellation** — slot released as the `CancelledError` propagates
+
+The slot cannot leak.
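+
+The three exit paths can be exercised with a bare `asyncio.Semaphore`. The sketch below is an assumption-laden illustration rather than `Bulkhead`'s source: `run_in_slot`, `BulkheadFull`, and `demo` are hypothetical names, and the bounded acquire is done with `asyncio.wait_for`.
+
+```python
+import asyncio
+from collections.abc import Awaitable, Callable
+
+
+class BulkheadFull(Exception):
+    """Local stand-in for httpware's BulkheadFullError."""
+
+
+async def run_in_slot(
+    sem: asyncio.Semaphore,
+    acquire_timeout: float,
+    work: Callable[[], Awaitable[str]],
+) -> str:
+    try:
+        await asyncio.wait_for(sem.acquire(), timeout=acquire_timeout)
+    except asyncio.TimeoutError:
+        raise BulkheadFull from None
+    try:
+        return await work()
+    finally:
+        sem.release()  # runs on success, exception, and cancellation alike
+
+
+async def demo() -> list[str]:
+    sem = asyncio.Semaphore(1)
+    started = asyncio.Event()
+    outcomes: list[str] = []
+
+    async def ok() -> str:
+        return "ok"
+
+    async def boom() -> str:
+        raise RuntimeError("boom")
+
+    async def hang() -> str:
+        started.set()
+        await asyncio.sleep(3600)
+        return "unreachable"
+
+    outcomes.append(await run_in_slot(sem, 0.1, ok))
+    try:
+        await run_in_slot(sem, 0.1, boom)
+    except RuntimeError:
+        outcomes.append("raised")
+
+    task = asyncio.create_task(run_in_slot(sem, 0.1, hang))
+    await started.wait()  # the task now holds the only slot
+    task.cancel()
+    try:
+        await task
+    except asyncio.CancelledError:
+        outcomes.append("cancelled")
+
+    outcomes.append("leaked" if sem.locked() else "free")
+    return outcomes
+
+
+outcomes = asyncio.run(demo())
+```
+
+After all three paths have run, the semaphore is unlocked again, which is the property the `try/finally` is there to guarantee.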
+
+### Sharing across clients
+
+Same pattern as `RetryBudget`. One instance, many clients:
+
+```python
+from httpware import AsyncClient
+from httpware.middleware.resilience import Bulkhead, Retry
+
+
+shared_bulkhead = Bulkhead(max_concurrent=10)
+
+
+async def main() -> None:
+    async with (
+        AsyncClient(base_url="https://api.example.com", middleware=[shared_bulkhead, Retry()]) as a,
+        AsyncClient(base_url="https://api.example.com", middleware=[shared_bulkhead, Retry()]) as b,
+    ):
+        ...  # combined in-flight across a + b is capped at 10
+```
+
+### Rejection
+
+When `acquire_timeout` elapses without a slot opening, `Bulkhead` raises `BulkheadFullError` (carries the configured `max_concurrent` and `acquire_timeout` for caller logging). See the [Errors reference](errors.md). The `httpware.bulkhead` `bulkhead.rejected` observability event fires at the same site — see [Observability](index.md#observability).
+
+## Composition
+
+The canonical ordering is `middleware=[Bulkhead, Retry]` — `Bulkhead` outermost so one slot covers all retry attempts of a single call:
+
+```python
+from httpware import AsyncClient
+from httpware.middleware.resilience import Bulkhead, Retry
+
+
+async def main() -> None:
+    async with AsyncClient(
+        base_url="https://api.example.com",
+        middleware=[
+            Bulkhead(max_concurrent=10),
+            Retry(),
+        ],
+    ) as client:
+        await client.get("/users/1")
+```
+
+Flipping the order (`[Retry, Bulkhead]`) means each retry attempt grabs a fresh slot — defeating the bulkhead under load. Don't do that.
+
+Cross-cutting middleware that emit per-call state (e.g., the Request-ID middleware in the [Middleware guide](middleware.md)) should sit outside `Retry` for the same reason — so all attempts of one call share one ID rather than getting a fresh ID per attempt.
+
+## See also
+
+- **[Middleware guide](middleware.md)** — write your own resilience middleware against the same protocol `Retry` and `Bulkhead` use.
+- **[Errors reference](errors.md)** — `RetryBudgetExhaustedError`, `BulkheadFullError`, and the broader exception tree.
+- **[Observability](index.md#observability)** — the four operational events these middleware emit.
+- **`planning/engineering.md` §3** — the formal Middleware/Seam-A contract.
diff --git a/docs/testing.md b/docs/testing.md
new file mode 100644
index 0000000..79cda44
--- /dev/null
+++ b/docs/testing.md
@@ -0,0 +1,88 @@
+# Testing guide
+
+`httpware`'s test seam is `httpx2`. Pass any `httpx2.AsyncClient` (including one built on `httpx2.MockTransport`) to `AsyncClient(httpx2_client=...)` — the middleware chain still runs end-to-end, only the wire is mocked. No special test mode, no monkey-patching, no `respx`.
+
+## The basic pattern
+
+```python
+from http import HTTPStatus
+
+import httpx2
+
+from httpware import AsyncClient
+
+
+def handler(request: httpx2.Request) -> httpx2.Response:
+    return httpx2.Response(HTTPStatus.OK, json={"id": 1, "name": "Alice"})
+
+
+async def test_get_user() -> None:
+    transport = httpx2.MockTransport(handler)
+    async with AsyncClient(httpx2_client=httpx2.AsyncClient(transport=transport)) as client:
+        response = await client.get("https://api.example.test/users/1")
+    assert response.status_code == HTTPStatus.OK
+    assert response.json()["name"] == "Alice"
+```
+
+The handler can be sync or async; `httpx2.MockTransport` supports both. The test above uses a sync handler.
+
+If you use `pytest-asyncio` in auto-mode (`asyncio_mode = "auto"` under `[tool.pytest.ini_options]`), async test functions don't need the `@pytest.mark.asyncio` decorator.
+
+## Recording / stateful handlers
+
+For tests that need to vary the response by call count or assert on the requests that came in, use a handler with instance state:
+
+```python
+from httpware.middleware.resilience import Retry
+
+
+class _ResponseSequence:
+    """Returns each status in order; records every request received."""
+
+    def __init__(self, statuses: list[int]) -> None:
+        self._statuses = list(statuses)
+        self.calls: list[httpx2.Request] = []
+
+    def __call__(self, request: httpx2.Request) -> httpx2.Response:
+        self.calls.append(request)
+        status = self._statuses.pop(0) if self._statuses else HTTPStatus.OK
+        return httpx2.Response(status, request=request)
+
+
+async def test_retry_succeeds_after_503() -> None:
+    handler = _ResponseSequence([HTTPStatus.SERVICE_UNAVAILABLE, HTTPStatus.OK])
+    transport = httpx2.MockTransport(handler)
+    async with AsyncClient(
+        httpx2_client=httpx2.AsyncClient(transport=transport),
+        middleware=[Retry(base_delay=0.001, max_delay=0.002)],
+    ) as client:
+        response = await client.get("https://example.test/x")
+    assert response.status_code == HTTPStatus.OK
+    assert len(handler.calls) == 2  # initial + 1 retry
+```
+
+The `base_delay`/`max_delay` are set tiny so the test runs instantly — no need for `freezegun` or sleep injection in most cases.
+
+## Testing your custom middleware
+
+Compose your middleware with the mock transport to exercise the chain end-to-end:
+
+```python
+async def test_my_middleware_adds_header() -> None:
+    handler = _ResponseSequence([HTTPStatus.OK])
+    async with AsyncClient(
+        httpx2_client=httpx2.AsyncClient(transport=httpx2.MockTransport(handler)),
+        middleware=[MyHeaderMiddleware()],
+    ) as client:
+        await client.get("https://example.test/x")
+    assert handler.calls[0].headers["X-My-Header"] == "expected-value"
+```
+
+For middleware with state-keeping (counters, circuit-breaker state), assert on instance attributes after running the call.
+
+## Why not `respx`?
+
+`httpware` deliberately uses `httpx2.MockTransport` instead of `respx` for its own tests. `MockTransport` is the public test seam in `httpx`: it is supported by the maintainers, stable across versions, and part of the public API surface. `respx` patches private internals and has historically broken across `httpx` major versions. Stick with `MockTransport` unless you have a specific reason not to.
+
+## See also
+
+- **[Middleware guide](middleware.md)** — write the middleware you're testing.
+- **[Resilience reference](resilience.md)** — testing `Retry`/`Bulkhead` configurations.
+- **`planning/engineering.md` §6** — the project's own testing patterns (Hypothesis property-based tests, `pytest-asyncio` auto-mode, the `RecordedTransport`-was-removed history).
diff --git a/mkdocs.yml b/mkdocs.yml
index cf0d0aa..e9afd1f 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -6,6 +6,10 @@ edit_uri: edit/main/docs/
 
 nav:
   - Quick-Start: index.md
+  - Resilience: resilience.md
+  - Middleware: middleware.md
+  - Errors: errors.md
+  - Testing: testing.md
   - Development:
       - Contributing: dev/contributing.md
 
diff --git a/planning/engineering.md b/planning/engineering.md
index 6d80123..70c8dc9 100644
--- a/planning/engineering.md
+++ b/planning/engineering.md
@@ -131,7 +131,9 @@ Post-pivot, the roadmap has three categories. Topic slugs in `planning/specs/` a
 - **Epic 3 — Resilience:**
   - **Shipped in v0.4 slice 1:** `Retry` middleware + Finagle-style `RetryBudget` token bucket + `attempt_timeout=` parameter (folded-in 3-1). See [`planning/specs/2026-06-05-retry-and-retry-budget-design.md`](specs/2026-06-05-retry-and-retry-budget-design.md) and [`planning/plans/2026-06-05-retry-and-retry-budget-plan.md`](plans/2026-06-05-retry-and-retry-budget-plan.md).
   - **Shipped in v0.4 slice 2:** `Bulkhead` middleware (concurrency limiter via `asyncio.Semaphore` with bounded acquire wait). See [`planning/specs/2026-06-05-bulkhead-design.md`](specs/2026-06-05-bulkhead-design.md) and [`planning/plans/2026-06-05-bulkhead-plan.md`](plans/2026-06-05-bulkhead-plan.md).
-  - **Remaining:** `3-6` extension-slot docs.
+  - **Shipped in v0.7:** `3-6` extension-slot docs — [`docs/middleware.md`](../docs/middleware.md). Covers the Middleware Protocol, phase decorators, a Request-ID worked example, and "when NOT to write a middleware." See [`planning/specs/2026-06-05-extension-slot-docs-design.md`](specs/2026-06-05-extension-slot-docs-design.md) and [`planning/plans/2026-06-05-extension-slot-docs-plan.md`](plans/2026-06-05-extension-slot-docs-plan.md).
+  - **v0.7 also bundles** the rest of the first-cut user docs surface — [`docs/resilience.md`](../docs/resilience.md) (Retry/RetryBudget/Bulkhead reference), [`docs/errors.md`](../docs/errors.md) (exception tree + catching strategies), [`docs/testing.md`](../docs/testing.md) (mock-transport injection pattern) — plus an "OpenTelemetry wiring" section appended to `docs/middleware.md`. See [`planning/specs/2026-06-05-v0.7-docs-expansion-design.md`](specs/2026-06-05-v0.7-docs-expansion-design.md) and [`planning/plans/2026-06-05-v0.7-docs-expansion-plan.md`](plans/2026-06-05-v0.7-docs-expansion-plan.md).
+  - **Epic 3 closed.**
 - **Epic 4 — Streaming:** SHIPPED in v0.5 (PR #…): `AsyncClient.stream()` context manager + Retry refuses streamed-body requests. See [`planning/specs/2026-06-05-streaming-design.md`](specs/2026-06-05-streaming-design.md) and [`planning/plans/2026-06-05-streaming-plan.md`](plans/2026-06-05-streaming-plan.md).
 - **Epic 5 — Observability:** SHIPPED in v0.6 (PR #…) — re-scoped from the original 4-story plan. `Retry` and `Bulkhead` emit operational events via stdlib `logging` + opt-in OpenTelemetry span events. Stories `5-1` (Layer 1 middleware hooks) and `5-4` (standalone OTel middleware) RETIRED — `opentelemetry-instrumentation-httpx` already covers transport-level tracing; a separate httpware middleware would duplicate it. See [`planning/specs/2026-06-05-observability-design.md`](specs/2026-06-05-observability-design.md) and [`planning/plans/2026-06-05-observability-plan.md`](plans/2026-06-05-observability-plan.md).
 - **Epic 6 — Ship v1.0:** `6-2` docs site (`mkdocs`), `6-3` benchmarks, `6-5` release flow (Trusted Publishers + Sigstore).
diff --git a/planning/plans/2026-06-05-extension-slot-docs-plan.md b/planning/plans/2026-06-05-extension-slot-docs-plan.md
new file mode 100644
index 0000000..5854561
--- /dev/null
+++ b/planning/plans/2026-06-05-extension-slot-docs-plan.md
@@ -0,0 +1,532 @@
+# Extension-slot docs (0.7.0, Epic 3 story 3-6) Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Ship `docs/middleware.md` — a user-facing guide to writing custom middleware against `httpware`'s Middleware protocol — plus the four small touchups that hang off it (mkdocs nav, README pointer, docs/index pointer, engineering.md §8 SHIPPED line) and 0.7.0 release notes. Closes Epic 3.
+
+**Architecture:** Docs-only PR. One new markdown page (~150 lines), four small textual edits to existing files, one new release-notes file. No source code changes. Verification is `mkdocs build --strict` + link resolution + the existing test/lint suites as no-op confirmation.
+
+**Tech Stack:** Markdown, mkdocs-material (strict build), no source code.
+
+**Target branch:** `feat/v0.7-middleware-docs`. Create from `main` before Task 1: `git checkout main && git pull && git checkout -b feat/v0.7-middleware-docs`.
+
+**Source spec:** [`planning/specs/2026-06-05-extension-slot-docs-design.md`](../specs/2026-06-05-extension-slot-docs-design.md). Read the spec's "Background" and "Deliverable" sections before starting: the *why* behind the non-resilience example choice and the Seam-A-only scope lives there.
+
+---
+
+## File structure
+
+**New files:**
+- `docs/middleware.md` — the guide itself (~150 lines)
+- `planning/releases/0.7.0.md` — release notes
+
+**Modified files:**
+- `mkdocs.yml` — add nav entry between Quick-Start and Development
+- `README.md` — one-sentence pointer in the existing "With resilience middleware" subsection
+- `docs/index.md` — one bullet in the existing "Where to go next" section
+- `planning/engineering.md` §8 — replace the "**Remaining:** `3-6` extension-slot docs." line under Epic 3
+
+**Commit cadence:** one commit per task. Per-task commits keep history reviewable.
+
+---
+
+## Task 1: Branch + create `docs/middleware.md`
+
+**Files:**
+- Create: `docs/middleware.md`
+
+- [ ] **Step 1: Create the branch**
+
+```bash
+git checkout main && git pull && git checkout -b feat/v0.7-middleware-docs
+```
+Expected: switched to a new branch.
+
+- [ ] **Step 2: Create `docs/middleware.md` with the full content below**
+
+````markdown
+# Writing custom middleware
+
+`httpware`'s primary extension point is the **Middleware protocol**. Middleware lets you add cross-cutting behavior — request-ID propagation, auth header injection, structured tracing, custom resilience policies, anything that wraps "send a request, get a response" — without subclassing `AsyncClient` or touching the transport.
+
+The built-in `Retry` and `Bulkhead` middleware are themselves implementations of this protocol; nothing about them is privileged. If you want a circuit breaker, a rate limiter, or a header-injecting auth layer, write a middleware. If your need is per-call (not cross-cutting), pass it through `request.extensions=` instead.
+
+## The protocol
+
+Two symbols, both exported from `httpware.middleware`:
+
+```python
+from collections.abc import Awaitable, Callable
+from typing import Protocol, TypeAlias, runtime_checkable
+import httpx2
+
+Next: TypeAlias = Callable[[httpx2.Request], Awaitable[httpx2.Response]]
+
+
+@runtime_checkable
+class Middleware(Protocol):
+    async def __call__(self, request: httpx2.Request, next: Next) -> httpx2.Response: ...
+```
+
+The chain is composed once at `AsyncClient.__init__` and frozen for the client's lifetime. The first entry in `middleware=[...]` is the outermost layer: when you write `middleware=[Bulkhead(...), Retry()]`, the bulkhead sees every request before the retry layer does, so one slot covers all retry attempts of the same call.
+
+Calling `await next(request)` forwards to the next layer (or, eventually, to the terminal that hits `httpx2`). You can:
+
+- **Forward unchanged:** `return await next(request)`
+- **Modify the request first:** mutate `request.headers` (or build a replacement) before forwarding
+- **Inspect or replace the response:** call `await next(...)`, then act on what comes back
+- **Short-circuit:** return a synthesized `httpx2.Response` without calling `next` at all
+- **Wrap the call in error handling:** `try: return await next(...) except ...` to translate failures
+
+Whatever you do, return an `httpx2.Response`. Raising an exception propagates up the chain (Retry catches retryable exceptions; everything else surfaces to the caller).
+
+## Phase decorators
+
+For the common cases where you don't need state-keeping on `self` and don't need to wrap the full `await next(...)` call, `httpware.middleware` exports three decorators that turn a single async function into a `Middleware`:
+
+```python
+from httpware.middleware import before_request, after_response, on_error
+```
+
+| Decorator | Function signature | When to use |
+|---|---|---|
+| `@before_request` | `async (request) -> request` | Transform the outgoing request (add a header, rewrite a URL). |
+| `@after_response` | `async (request, response) -> response` | Transform the incoming response (decode, log, attach metadata). |
+| `@on_error` | `async (request, exc) -> response \| None` | Translate or absorb a failure. Return `None` to re-raise. Catches `Exception` (not `BaseException`), so `asyncio.CancelledError` propagates. |
+
+Brief example — adding an `Authorization` header before every request:
+
+```python
+import httpx2
+
+from httpware import AsyncClient
+from httpware.middleware import before_request
+
+
+@before_request
+async def add_bearer(request: httpx2.Request) -> httpx2.Request:
+    request.headers["Authorization"] = "Bearer secret-token"
+    return request
+
+
+async def main() -> None:
+    async with AsyncClient(base_url="https://api.example.com", middleware=[add_bearer]) as client:
+        await client.get("/me")
+```
+
+**Reach for the raw `Middleware` protocol when:** you need instance state (a counter, a CircuitBreaker's open/closed flag), you need to inspect both the request AND its response (e.g., timing), or you need to interleave behavior around the `await next(...)` call (e.g., emit one log line at the start and one at the end). The decorators are a convenience for the cases where a single function suffices.
+
+## Worked example: request-ID propagation
+
+A `RequestIdMiddleware` that assigns a per-call UUID, injects it as an outgoing header, and logs it alongside the response status. This is the canonical "trace every request through your distributed system" pattern.
+
+```python
+import logging
+import uuid
+
+import httpx2
+
+from httpware import AsyncClient, Retry
+from httpware.middleware import Next
+
+
+_LOGGER = logging.getLogger("myapp.request_id")
+
+
+class RequestIdMiddleware:
+    """Assign a per-call X-Request-Id; log it on response.
+
+    Place OUTSIDE Retry so all attempts of the same call share one ID
+    (so a single call's retries all surface under the same correlation
+    key in your logs, and match the URL attribute on httpware.retry's
+    emitted events).
+    """
+
+    def __init__(self, *, header: str = "X-Request-Id") -> None:
+        self._header = header
+
+    async def __call__(self, request: httpx2.Request, next: Next) -> httpx2.Response:  # noqa: A002
+        request_id = str(uuid.uuid4())
+        request.headers[self._header] = request_id
+        response = await next(request)
+        _LOGGER.info(
+            "request complete",
+            extra={"request_id": request_id, "status": response.status_code},
+        )
+        return response
+
+
+async def main() -> None:
+    async with AsyncClient(
+        base_url="https://api.example.com",
+        middleware=[RequestIdMiddleware(), Retry()],  # ID outside Retry
+    ) as client:
+        await client.get("/users/1")
+```
+
+A note on logger names: the example logs under `myapp.request_id`, NOT under `httpware.*`. The `httpware.*` namespace is reserved for events emitted by the library itself (see [Observability](index.md#observability) — `httpware.retry` and `httpware.bulkhead` are stable contracts). Consumer middleware should use your application's own logger namespace.
+
+The example pairs naturally with the 0.6.0 observability events: a `httpware.retry` `retry.giving_up` log record carries a `url` attribute, and your `RequestIdMiddleware` sets an `X-Request-Id` for that same call. Correlate the two in your log aggregator and you have end-to-end visibility from "this user's request" to "we gave up after N retries."
+
+## When NOT to write a middleware
+
+- **Redaction:** Use a `logging.Filter` on the consumer side. `httpware` deliberately does no redaction in-library (per the 0.6.0 observability design).
+- **URL or header validation:** `httpx2` owns it. Don't reimplement.
+- **Per-call behavior that doesn't apply to other calls:** Pass through `request.extensions=` (or the `extensions=` kwarg at the call site) instead. Middleware exists for *cross-cutting* concerns.
+- **HTTP-level span creation for tracing:** Install `opentelemetry-instrumentation-httpx` instead of writing an OTel middleware in httpware. We retired story `5-4` (standalone OTel middleware) for this reason — `opentelemetry-instrumentation-httpx` already covers transport-level tracing, and a separate httpware layer would duplicate it. See `planning/engineering.md` §8.
+
+## See also
+
+- **`planning/engineering.md` §3 (Seam A)** — the formal protocol contract and why the chain is frozen at construction.
+- **`src/httpware/middleware/resilience/`** — `Retry`, `Bulkhead`, `RetryBudget` as real-world consumers of this exact protocol.
+- **[Quick-Start composition example](index.md#with-resilience-middleware)** — composing built-in middleware.
+````
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add docs/middleware.md
+git commit -m "docs(middleware): write custom-middleware guide (3-6)
+
+New docs/middleware.md covering:
+- The Middleware Protocol + Next type, exported from httpware.middleware
+- Phase decorators (@before_request, @after_response, @on_error) as
+  ergonomic shortcuts for the no-state-keeping cases
+- Worked example: a RequestIdMiddleware that assigns a per-call UUID
+  via X-Request-Id and logs it alongside the response status. Placed
+  outside Retry on purpose so all attempts of the same call share one
+  ID and correlate with the 0.6.0 observability events' url attribute
+- 'When NOT to write a middleware' section covering redaction (use a
+  logging.Filter), URL/header validation (httpx2 owns it), per-call
+  behavior (use request.extensions=), and HTTP-tracing (install
+  opentelemetry-instrumentation-httpx instead)
+
+Closes the deferred-tutorial half of story 3-6. See spec at
+planning/specs/2026-06-05-extension-slot-docs-design.md."
+```
+
+---
+
+## Task 2: Add nav entry to `mkdocs.yml` + verify strict build
+
+**Files:**
+- Modify: `mkdocs.yml`
+
+- [ ] **Step 1: Add nav entry**
+
+The current `nav:` block reads:
+```yaml
+nav:
+  - Quick-Start: index.md
+  - Development:
+      - Contributing: dev/contributing.md
+```
+
+Change to:
+```yaml
+nav:
+  - Quick-Start: index.md
+  - Middleware: middleware.md
+  - Development:
+      - Contributing: dev/contributing.md
+```
+
+- [ ] **Step 2: Verify mkdocs strict build is clean**
+
+```bash
+uv run --with mkdocs --with mkdocs-material mkdocs build --strict 2>&1 | tail -20
+```
+Expected: `Documentation built in <time>` with no warnings about missing files, broken links, or unrecognized cross-references. Strict mode treats warnings as errors.
+
+The new page links to `index.md#with-resilience-middleware`, `index.md#observability`, and uses the path `planning/engineering.md` (the latter is a repo path, not a docs path — mkdocs won't try to resolve it as an internal anchor, which is the intent).
+
+If the strict build complains about anchors, the usual cause is that a heading in `docs/index.md` doesn't slugify to the anchor we expected. The auto-generated slugs are:
+- "## With resilience middleware" → `#with-resilience-middleware`
+- "## Observability" → `#observability`
+
+Both exist verbatim in the current `docs/index.md`.
+
+- [ ] **Step 3: Clean up the local site/ directory and commit**
+
+```bash
+rm -rf site/
+git add mkdocs.yml
+git commit -m "docs(nav): add Middleware page to mkdocs nav (3-6)
+
+Inserts between Quick-Start and Development. The page itself (added
+in the prior commit) is reachable from the Quick-Start's resilience
+section and the README; this nav slot is for users browsing the
+docs site directly."
+```
+
+---
+
+## Task 3: README.md pointer
+
+**Files:**
+- Modify: `README.md`
+
+- [ ] **Step 1: Append a pointer sentence to the existing "With resilience middleware" subsection**
+
+The current subsection (around L45-L62) ends with the resilience code block. Append a one-sentence pointer immediately after the closing triple-backtick of that code block (so it sits above the next subsection `### Streaming responses`).
+
+Find:
+```markdown
+    ) as client:
+        user = await client.get("/users/1", response_model=User)
+```
+```
+
+(The trailing ``` is the end of the code fence.)
+
+Add ONE blank line, then this sentence, then another blank line before `### Streaming responses`:
+
+```markdown
+Need a custom middleware (auth, tracing, request-ID propagation, etc.)? See the [Middleware guide](docs/middleware.md).
+```
+
+So the surrounding context becomes:
+```markdown
+    ) as client:
+        user = await client.get("/users/1", response_model=User)
+```
+
+Need a custom middleware (auth, tracing, request-ID propagation, etc.)? See the [Middleware guide](docs/middleware.md).
+
+### Streaming responses
+```
+
+- [ ] **Step 2: Verify the link works locally**
+
+The README is rendered on GitHub. A relative link `docs/middleware.md` from a repo-root README resolves against the branch being viewed, so it becomes `<repo>/blob/main/docs/middleware.md` once merged. Check it visually by opening README.md in any markdown previewer and confirming the link clicks through.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add README.md
+git commit -m "docs(readme): link to new Middleware guide (3-6)
+
+One-sentence pointer in the existing 'With resilience middleware'
+subsection. Surfaces the new guide for users skimming the README who
+want to write their own middleware."
+```
+
+---
+
+## Task 4: docs/index.md pointer
+
+**Files:**
+- Modify: `docs/index.md`
+
+- [ ] **Step 1: Add a bullet to the existing "Where to go next" section**
+
+The current section (around L107-L111) reads:
+```markdown
+## Where to go next
+
+- **[Engineering Notes](https://github.com/modern-python/httpware/blob/main/planning/engineering.md)** — design invariants, the three protocol seams, exception contract, module layout, testing patterns, optional-extras pattern. Lives in the repo at `planning/engineering.md`.
+- **[Contributing](dev/contributing.md)** — setup, conventions, workflow.
+- **[Release notes](https://github.com/modern-python/httpware/releases)** — per-version changelogs.
+```
+
+Insert a new bullet as the FIRST bullet in that list (above Engineering Notes), since the Middleware guide is the most user-facing of the four entries:
+
+```markdown
+- **[Middleware guide](middleware.md)** — write your own middleware. Covers the Middleware Protocol, the phase decorators, and a worked Request-ID propagation example.
+```
+
+So the section becomes:
+```markdown
+## Where to go next
+
+- **[Middleware guide](middleware.md)** — write your own middleware. Covers the Middleware Protocol, the phase decorators, and a worked Request-ID propagation example.
+- **[Engineering Notes](https://github.com/modern-python/httpware/blob/main/planning/engineering.md)** — design invariants, the three protocol seams, exception contract, module layout, testing patterns, optional-extras pattern. Lives in the repo at `planning/engineering.md`.
+- **[Contributing](dev/contributing.md)** — setup, conventions, workflow.
+- **[Release notes](https://github.com/modern-python/httpware/releases)** — per-version changelogs.
+```
+
+- [ ] **Step 2: Verify mkdocs strict build still clean**
+
+```bash
+uv run --with mkdocs --with mkdocs-material mkdocs build --strict 2>&1 | tail -10
+rm -rf site/
+```
+Expected: still clean (the `middleware.md` link resolves because Task 1 created the file, and Task 2 already added the page to nav).
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add docs/index.md
+git commit -m "docs(index): link to Middleware guide from Where-to-go-next (3-6)
+
+Adds the guide as the first bullet — most user-facing of the four
+entries in that section."
+```
+
+---
+
+## Task 5: `planning/engineering.md` §8 — mark 3-6 SHIPPED
+
+**Files:**
+- Modify: `planning/engineering.md`
+
+- [ ] **Step 1: Replace the Epic 3 Remaining line**
+
+The current Epic 3 block in §8 (around L131-L134) reads:
+```markdown
+- **Epic 3 — Resilience:**
+  - **Shipped in v0.4 slice 1:** `Retry` middleware + Finagle-style `RetryBudget` token bucket + `attempt_timeout=` parameter (folded-in 3-1). See [`planning/specs/2026-06-05-retry-and-retry-budget-design.md`](specs/2026-06-05-retry-and-retry-budget-design.md) and [`planning/plans/2026-06-05-retry-and-retry-budget-plan.md`](plans/2026-06-05-retry-and-retry-budget-plan.md).
+  - **Shipped in v0.4 slice 2:** `Bulkhead` middleware (concurrency limiter via `asyncio.Semaphore` with bounded acquire wait). See [`planning/specs/2026-06-05-bulkhead-design.md`](specs/2026-06-05-bulkhead-design.md) and [`planning/plans/2026-06-05-bulkhead-plan.md`](plans/2026-06-05-bulkhead-plan.md).
+  - **Remaining:** `3-6` extension-slot docs.
+```
+
+Replace the `- **Remaining:** ...` line with:
+```markdown
+  - **Shipped in v0.7:** `3-6` extension-slot docs — [`docs/middleware.md`](../docs/middleware.md). Covers the Middleware Protocol, phase decorators, a Request-ID worked example, and "when NOT to write a middleware." See [`planning/specs/2026-06-05-extension-slot-docs-design.md`](specs/2026-06-05-extension-slot-docs-design.md) and [`planning/plans/2026-06-05-extension-slot-docs-plan.md`](plans/2026-06-05-extension-slot-docs-plan.md).
+  - **Epic 3 closed.**
+```
+
+- [ ] **Step 2: Commit**
+
+```bash
+git add planning/engineering.md
+git commit -m "docs(engineering): mark Epic 3 closed (3-6 shipped in v0.7)
+
+§8 now records the extension-slot docs as shipped in v0.7 and notes
+Epic 3 closed. The remaining roadmap collapses to Epic 6 (ship v1.0)
+plus Epic 5's already-shipped observability work."
+```
+
+---
+
+## Task 6: Create `planning/releases/0.7.0.md`
+
+**Files:**
+- Create: `planning/releases/0.7.0.md`
+
+- [ ] **Step 1: Write the release notes**
+
+Create `planning/releases/0.7.0.md`:
+
+```markdown
+# httpware 0.7.0 — Middleware extension guide (docs-only)
+
+**0.7.0 is a docs-only release. No API changes.** Code written against 0.6.0 continues to work unchanged.
+
+This release ships the final piece of Epic 3 — a user-facing guide to writing custom middleware against `httpware`'s Middleware protocol. With it, Epic 3 (Resilience) closes.
+
+## What's new
+
+- **[`docs/middleware.md`](../../docs/middleware.md)** — a new top-level docs page covering:
+  - The `Middleware` Protocol and `Next` type alias, both exported from `httpware.middleware`.
+  - The three phase decorators (`@before_request`, `@after_response`, `@on_error`) as ergonomic shortcuts for the common cases.
+  - A worked `RequestIdMiddleware` example — assign a per-call UUID, propagate via `X-Request-Id`, log it alongside the response status. Placed outside `Retry` so all attempts of one call share one ID, and correlates naturally with the 0.6.0 observability events' `url` attribute.
+  - A "when NOT to write a middleware" section pointing redaction at `logging.Filter`, URL/header validation at `httpx2`, per-call behavior at `request.extensions=`, and HTTP-level tracing at `opentelemetry-instrumentation-httpx`.
+
+Plus small touchups so the guide is discoverable: a nav entry in `mkdocs.yml`, a one-sentence pointer in the README, and a "Where to go next" bullet in `docs/index.md`.
+
+## What's not in this release
+
+- No source code changes. The `Middleware` protocol, `Next` type, and phase decorators all already existed (shipped pre-0.4 via Epic 2); this release documents them.
+- No new built-in middleware (no CircuitBreaker, no RateLimiter, no metrics counter). The deliberate non-resilience worked-example choice keeps the guide focused on teaching the protocol rather than shipping a half-baked toy that gets cargo-culted.
+- No mkdocs publish workflow / docs-site infra. That's Epic 6 story `6-2`; this release just makes the strict build green.
+
+## Epic 3 closed
+
+Epic 3 (Resilience) has shipped end-to-end:
+- v0.4 slice 1 — `Retry` + `RetryBudget` + `attempt_timeout=`
+- v0.4 slice 2 — `Bulkhead`
+- v0.7 — extension-slot docs
+
+Remaining roadmap is Epic 6 (ship v1.0): `6-2` docs site infrastructure, `6-3` benchmarks, `6-5` release flow (Trusted Publishers + Sigstore).
+
+## References
+
+- Spec: [`planning/specs/2026-06-05-extension-slot-docs-design.md`](../specs/2026-06-05-extension-slot-docs-design.md)
+- Plan: [`planning/plans/2026-06-05-extension-slot-docs-plan.md`](../plans/2026-06-05-extension-slot-docs-plan.md)
+- Roadmap: [`planning/engineering.md`](../engineering.md) §8
+```
+
+- [ ] **Step 2: Commit**
+
+```bash
+git add planning/releases/0.7.0.md
+git commit -m "docs: 0.7.0 release notes — middleware guide + Epic 3 closed
+
+Docs-only release. Calls out the new docs/middleware.md page, notes the
+non-goals (no source changes, no new built-in middleware, no docs-site
+infra), and records Epic 3 as closed end-to-end after v0.4 + v0.7."
+```
+
+---
+
+## Task 7: Final verification + push
+
+**Files:** none modified; verification only.
+
+- [ ] **Step 1: Lint-ci (sanity)**
+
+```bash
+just lint-ci
+```
+Expected: clean. Lint runs against source code, not docs, so this is a pure no-op confirmation that we haven't accidentally touched a source file.
+
+- [ ] **Step 2: Full test suite (sanity)**
+
+```bash
+just test
+```
+Expected: 251 passed, 100% coverage. Same no-op confirmation logic — no source touched.
+
+- [ ] **Step 3: mkdocs strict build**
+
+```bash
+uv run --with mkdocs --with mkdocs-material mkdocs build --strict 2>&1 | tail -20
+rm -rf site/
+```
+Expected: `Documentation built in <time>` with zero warnings about missing files, broken anchors, or unrecognized links. Strict mode treats warnings as errors.
+
+- [ ] **Step 4: Manual cross-reference scan**
+
+```bash
+grep -nE '\]\(' docs/middleware.md
+```
+
+Each link should be one of:
+- `index.md#with-resilience-middleware` (resolves — section exists in `docs/index.md`)
+- `index.md#observability` (resolves — section exists in `docs/index.md`)
+
+Repo paths like `planning/engineering.md` are inline references in prose (not markdown links) so they don't need to resolve as anchors.
+
+- [ ] **Step 5: Architecture invariants (sanity)**
+
+```bash
+grep -rE 'httpx2\._' src/httpware/ || echo "PASS: no httpx2 private API"
+grep -rE 'from __future__ import annotations' src/httpware/ || echo "PASS: no __future__ annotations"
+grep -rE '\bprint\(' src/httpware/ || echo "PASS: no print()"
+grep -rE 'logging\.(basicConfig|getLogger)\(\)' src/httpware/ || echo "PASS: no global logging"
+grep -rE '# (type|mypy): ignore' src/httpware/ || echo "PASS: no type/mypy ignore"
+```
+Each should print PASS. (Docs-only PR — these are unchanged from main, just confirming we haven't drifted.)
+
+- [ ] **Step 6: Push the branch**
+
+```bash
+git push -u origin feat/v0.7-middleware-docs
+```
+
+DO NOT open the PR yet — leave that to `finishing-a-development-branch`.
+
+---
+
+## Out of scope for this plan (per the spec)
+
+These items are deliberately deferred or retired. Do NOT do them in this PR:
+
+- **No source code changes.** Zero `src/` files modified. The protocol + decorators already exist and are public; this PR documents them.
+- **No CircuitBreaker / RateLimiter / custom-resilience worked example.** The non-resilience Request-ID example is intentional.
+- **No ResponseDecoder (Seam B) or optional-extras-pattern (Seam C) coverage.** Those stay in `engineering.md`.
+- **No mkdocs publish / docs-site infrastructure.** Epic 6 story `6-2`.
+- **No version bump in `pyproject.toml`.** Tag-driven; bump not required.
+- **No CLAUDE.md changes.**
+- **No new `# noqa` suppressions beyond `# noqa: A002` on the `next` parameter name** (matches `src/httpware/middleware/__init__.py` convention).
diff --git a/planning/plans/2026-06-05-v0.7-docs-expansion-plan.md b/planning/plans/2026-06-05-v0.7-docs-expansion-plan.md
new file mode 100644
index 0000000..f46acf1
--- /dev/null
+++ b/planning/plans/2026-06-05-v0.7-docs-expansion-plan.md
@@ -0,0 +1,948 @@
+# v0.7 docs expansion (Resilience + Errors + Testing + OTel wiring) Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Stack 9 docs-only commits onto the open PR #28 branch so 0.7.0 ships with a complete first-cut user-docs surface — `docs/resilience.md`, `docs/errors.md`, `docs/testing.md`, and an "OpenTelemetry wiring" section appended to `docs/middleware.md` — plus the nav, index, engineering, release-notes, and PR-description touchups that tie everything together.
+
+**Architecture:** Docs-only PR. Three new markdown pages (~380 lines combined), one ~30-line append to an existing page, four small textual edits to existing files (`mkdocs.yml` nav, `docs/index.md` Where-to-go-next, `planning/engineering.md` §8, `planning/releases/0.7.0.md` rewrite), one `gh pr edit` to update PR #28. Zero source code changes. Verification: `mkdocs build --strict` + link scan + the existing test/lint suites as no-op confirmation.
+
+**Tech Stack:** Markdown, mkdocs-material (strict build), `gh` CLI. No source code.
+
+**Target branch:** `feat/v0.7-middleware-docs` — the branch with PR #28 open. **Do NOT create a new branch.** The new commits stack on top of the existing 6 plus the two spec commits.
+
+**Source spec:** [`planning/specs/2026-06-05-v0.7-docs-expansion-design.md`](../specs/2026-06-05-v0.7-docs-expansion-design.md). Read its "Deliverable" section for the page-by-page rationale (why this nav order, why OTel is a section not a page, why the Request-ID example sits where it does).
+
+---
+
+## File structure
+
+**New files:**
+- `docs/resilience.md` — Retry / RetryBudget / Bulkhead reference (~180 lines)
+- `docs/errors.md` — exception tree, status-mapping, catching strategies (~120 lines)
+- `docs/testing.md` — `httpx2.MockTransport` injection pattern (~80 lines)
+
+**Modified files:**
+- `docs/middleware.md` — append "Wiring OpenTelemetry" section
+- `mkdocs.yml` — three new nav entries
+- `docs/index.md` — three new "Where to go next" bullets + 1 amended bullet
+- `planning/engineering.md` §8 — append a sub-bullet to Epic 3's "Shipped in v0.7" line
+- `planning/releases/0.7.0.md` — rewrite (title + body)
+- PR #28 title + body — via `gh pr edit` after the new commits push
+
+**Commit cadence:** one commit per task. Per-task commits keep history reviewable and make a per-page revert trivial if needed.
+
+---
+
+## Task 1: Append "Wiring OpenTelemetry" section to `docs/middleware.md`
+
+**Why first:** the index.md Where-to-go-next change (Task 6) references "and OpenTelemetry wiring" in the Middleware-guide bullet, so the section it points to has to exist first.
+
+**Files:**
+- Modify: `docs/middleware.md`
+
+- [ ] **Step 1: Insert the new section**
+
+In `docs/middleware.md`, find this anchor (the start of the existing "See also" section near the end of the file):
+```markdown
+## See also
+
+- **`planning/engineering.md` §3 (Seam A)** — the formal protocol contract and why the chain is frozen at construction.
+```
+
+Insert this new section IMMEDIATELY BEFORE that `## See also` heading (so the new section is sandwiched between "When NOT to write a middleware" and "See also"):
+
+````markdown
+## Wiring OpenTelemetry
+
+`httpware[otel]` only ships `opentelemetry-api`. To make the observability events emitted by `Retry` and `Bulkhead` visible, you also need:
+
+- An **SDK** (`opentelemetry-sdk`) to actually collect spans
+- An **HTTP instrumentor** (`opentelemetry-instrumentation-httpx`) so each HTTP call creates a span — `httpware`'s events attach to that span via `trace.get_current_span().add_event(...)`
+
+Minimal setup (console exporter for development):
+
+```python
+from opentelemetry import trace
+from opentelemetry.sdk.trace import TracerProvider
+from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
+from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor
+
+trace.set_tracer_provider(TracerProvider())
+trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
+HTTPXClientInstrumentor().instrument()
+```
+
+After this runs, every `httpware` HTTP call gets an `HTTP <method>` span from the instrumentor, and Retry/Bulkhead observability events appear as span events on it (no extra configuration needed in `httpware` itself — the events fire whenever an active span is present).
+
+For production, swap `ConsoleSpanExporter` for your OTLP/Jaeger/Zipkin exporter. See the [OpenTelemetry Python docs](https://opentelemetry.io/docs/languages/python/) for the full SDK setup.
+
+````
+
+- [ ] **Step 2: Verify mkdocs strict build is still clean**
+
+```bash
+uv run --with mkdocs --with mkdocs-material mkdocs build --strict 2>&1 | tail -10
+rm -rf site/
+```
+Expected: `Documentation built in <time>` with no warnings. The new link to `opentelemetry.io` is external, and mkdocs strict mode doesn't validate external URLs.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add docs/middleware.md
+git commit -m "docs(middleware): add 'Wiring OpenTelemetry' section
+
+httpware[otel] only ships opentelemetry-api; to make Retry/Bulkhead
+observability events visible users also need an SDK + the
+opentelemetry-instrumentation-httpx instrumentor (so each HTTP call
+has an active span our events can attach to).
+
+Section sits between 'When NOT to write a middleware' and 'See also'.
+Minimal console-exporter setup for dev; pointer to OTel Python docs
+for production exporter wiring."
+```
+
+---
+
+## Task 2: Create `docs/resilience.md`
+
+**Files:**
+- Create: `docs/resilience.md`
+
+- [ ] **Step 1: Create the file with the full content below**
+
+````markdown
+# Resilience reference
+
+`httpware` ships three resilience primitives under `httpware.middleware.resilience`, all composable through the standard [Middleware](middleware.md) chain:
+
+- **`Retry`** — automatic retry of transient failures with full-jitter exponential backoff
+- **`RetryBudget`** — Finagle-style token bucket that bounds the global retry rate to prevent retry storms when downstreams degrade
+- **`Bulkhead`** — concurrency limiter via `asyncio.Semaphore` with bounded acquire-wait
+
+The canonical composition is `middleware=[Bulkhead(...), Retry()]` — `Bulkhead` outside `Retry` so one slot covers all retry attempts of a single call. Reach for the [Middleware guide](middleware.md) when you want to write your own resilience policy.
+
+## `Retry`
+
+```python
+from httpware.middleware.resilience import Retry
+```
+
+| Parameter | Default | Effect |
+|---|---|---|
+| `max_attempts` | `3` | Total tries (including the first). `1` disables retries entirely; `<1` raises `ValueError`. |
+| `base_delay` | `0.1` (s) | Floor for the full-jitter exponential backoff. |
+| `max_delay` | `5.0` (s) | Ceiling for backoff. |
+| `attempt_timeout` | `None` | If set, each individual attempt is wrapped in `asyncio.timeout(attempt_timeout)`. |
+| `retry_status_codes` | `frozenset({408, 429, 502, 503, 504})` | Status codes considered retryable. |
+| `retry_methods` | `frozenset({"GET", "HEAD", "OPTIONS", "PUT", "DELETE"})` | Idempotent methods only by default. POST excluded; pass an explicit frozenset including `"POST"` to retry it. |
+| `respect_retry_after` | `True` | When the response carries a `Retry-After` header on a retryable status, sleep for the header value (clamped to `max_delay`) instead of the jittered backoff. |
+| `budget` | `RetryBudget()` (default-configured) | The token bucket. Pass a shared `RetryBudget` instance to apply one budget across multiple clients. |
+
+### Retry-After parsing
+
+`Retry-After` is parsed as either:
+- **Integer seconds** — `Retry-After: 30` → sleep 30s (clamped to `max_delay`)
+- **HTTP-date** (RFC 9110 §5.6.7) — `Retry-After: Wed, 21 Oct 2026 07:28:00 GMT` → sleep until that absolute time (clamped to `max_delay`, floored at 0)
+
+Negative integer values are clamped to 0. Malformed values are ignored, falling back to the jittered backoff.
+
+### Streaming-body refusal
+
+If the request body was an async-iterable, `Retry` refuses to retry — the iterator is consumed after the first attempt and can't replay. The original exception is re-raised with a PEP 678 note:
+
+```
+httpware: not retrying — request body is a stream that cannot replay across attempts
+```
+
+The same refusal note is added at the non-idempotent early-exit sites (when streaming combines with a non-idempotent method). The observability event `httpware.retry` `retry.streaming_refused` fires only at the retryable-failure-path site — see [Observability](index.md#observability).
+
+### Exhaustion behavior
+
+On exhaustion, `Retry` re-raises the *last* exception observed (e.g., `ServiceUnavailableError`, `NetworkError`), preserving the original class so `except ServiceUnavailableError` still catches it. A PEP 678 note is added: `httpware: gave up after N attempts`.
+
+If exhaustion is caused by the budget refusing a retry (not by `max_attempts`), the raised exception is `RetryBudgetExhaustedError` instead, with `last_response` / `last_exception` / `attempts` fields populated. See [Errors reference](errors.md).
+
+## `RetryBudget`
+
+```python
+from httpware.middleware.resilience import RetryBudget
+```
+
+A Finagle-style token bucket bounding retry rate. Each request deposits a token; each retry attempts to withdraw one. Available retries are bounded by `percent_can_retry` of recent deposits, plus a `min_retries_per_sec * ttl` floor.
+
+| Parameter | Default | Effect |
+|---|---|---|
+| `ttl` | `10.0` (s) | Sliding window over which deposits and withdrawals count. |
+| `min_retries_per_sec` | `10.0` | Absolute floor — at least this many retries/sec are permitted regardless of deposit rate. |
+| `percent_can_retry` | `0.2` | Fraction of recent deposits that can convert to retries (above the floor). |
+
+### The token-bucket formula
+
+```
+ceiling = int(len(deposits_in_window) * percent_can_retry) + int(min_retries_per_sec * ttl)
+```
+
+A withdrawal fails when `len(withdrawn_in_window) >= ceiling`.
+
+### Why a floor matters
+
+If the deposit rate is zero (no traffic yet), the percent term is zero — without the floor, the very first retry would be refused. The floor lets small-traffic clients still retry on isolated failures; high-traffic clients are dominated by the percent term and the floor becomes irrelevant.
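To make the formula concrete: with the defaults and no traffic, the ceiling is `int(0 * 0.2) + int(10.0 * 10.0) = 100`; with 1,000 deposits in the window it is `200 + 100 = 300`. The sketch below implements the same arithmetic over two deques. `SlidingWindowBudget`, its method names, and the injectable `clock` are assumptions made for illustration, not the library's implementation.

```python
import time
from collections import deque
from collections.abc import Callable


class SlidingWindowBudget:
    """Toy budget: ceiling = int(deposits * percent) + int(min_rps * ttl)."""

    def __init__(
        self,
        ttl: float = 10.0,
        min_retries_per_sec: float = 10.0,
        percent_can_retry: float = 0.2,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._ttl = ttl
        self._min_rps = min_retries_per_sec
        self._percent = percent_can_retry
        self._clock = clock
        self._deposits: deque[float] = deque()
        self._withdrawals: deque[float] = deque()

    def _expire(self, now: float) -> None:
        # Drop timestamps that have fallen out of the sliding window.
        for window in (self._deposits, self._withdrawals):
            while window and now - window[0] > self._ttl:
                window.popleft()

    def deposit(self) -> None:
        now = self._clock()
        self._expire(now)
        self._deposits.append(now)

    def ceiling(self) -> int:
        self._expire(self._clock())
        floor = int(self._min_rps * self._ttl)
        return int(len(self._deposits) * self._percent) + floor

    def try_withdraw(self) -> bool:
        now = self._clock()
        self._expire(now)
        if len(self._withdrawals) >= self.ceiling():
            return False  # budget refuses this retry
        self._withdrawals.append(now)
        return True
```

Passing a fake `clock` makes the window expiry testable without sleeping.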
+
+### Sharing across clients
+
+Pass the same `RetryBudget` instance to multiple `AsyncClient`s when they hit the same downstream — one joint budget covers them all:
+
+```python
+import asyncio
+
+from httpware import AsyncClient
+from httpware.middleware.resilience import Retry, RetryBudget
+
+
+shared = RetryBudget()
+
+
+async def main() -> None:
+    async with (
+        AsyncClient(base_url="https://api.example.com", middleware=[Retry(budget=shared)]) as users,
+        AsyncClient(base_url="https://api.example.com", middleware=[Retry(budget=shared)]) as orders,
+    ):
+        await asyncio.gather(users.get("/users/1"), orders.get("/orders/1"))
+```
+
+### Single-thread assumption
+
+`RetryBudget` is asyncio-aware — deque mutations between await points are atomic on a single event loop. Cross-thread use is out of scope; if you need that, wrap calls in a lock yourself.
+
+## `Bulkhead`
+
+```python
+from httpware.middleware.resilience import Bulkhead
+```
+
+Concurrency limiter via `asyncio.Semaphore`. Acquires a slot before each request (bounded by `acquire_timeout`); releases on success, exception, AND cancellation.
+
+| Parameter | Default | Effect |
+|---|---|---|
+| `max_concurrent` | **REQUIRED** | Maximum in-flight requests. `<1` raises `ValueError`. No default — the right cap depends on downstream capacity and SLA. |
+| `acquire_timeout` | `1.0` (s) | How long to wait for a slot before raising `BulkheadFullError`. `None` waits forever; `0` fails fast. `<0` raises `ValueError`. |
+
+### Slot release contract
+
+The slot is released in a `try/finally` around `await next(request)`, so all three exit paths release deterministically:
+- **Success** — slot released after the response returns
+- **Exception** — slot released before the exception propagates
+- **Cancellation** — slot released as the `CancelledError` propagates
+
+The slot cannot leak.
+
+### Sharing across clients
+
+Same pattern as `RetryBudget`. One instance, many clients:
+
+```python
+from httpware import AsyncClient
+from httpware.middleware.resilience import Bulkhead, Retry
+
+shared_bulkhead = Bulkhead(max_concurrent=10)
+
+
+async def main() -> None:
+    async with (
+        AsyncClient(base_url="https://api.example.com", middleware=[shared_bulkhead, Retry()]) as a,
+        AsyncClient(base_url="https://api.example.com", middleware=[shared_bulkhead, Retry()]) as b,
+    ):
+        ...  # combined in-flight across a + b is capped at 10
+```
+
+### Rejection
+
+When `acquire_timeout` elapses without a slot opening, `Bulkhead` raises `BulkheadFullError` (carries the configured `max_concurrent` and `acquire_timeout` for caller logging). See [Errors reference](errors.md). The `httpware.bulkhead` `bulkhead.rejected` observability event fires at the same site — see [Observability](index.md#observability).
+
+## Composition
+
+The canonical ordering is `middleware=[Bulkhead, Retry]` — `Bulkhead` outermost so one slot covers all retry attempts of a single call:
+
+```python
+from httpware import AsyncClient
+from httpware.middleware.resilience import Bulkhead, Retry
+
+
+async def main() -> None:
+    async with AsyncClient(
+        base_url="https://api.example.com",
+        middleware=[
+            Bulkhead(max_concurrent=10),
+            Retry(),
+        ],
+    ) as client:
+        await client.get("/users/1")
+```
+
+Flipping the order (`[Retry, Bulkhead]`) means each retry attempt grabs a fresh slot — defeating the bulkhead under load. Don't do that.
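The effect of the ordering can be seen in a toy chain that counts slot acquisitions. Everything here (`ToyBulkhead`, `ToyRetry`, `build_chain`, the string request and integer status) is a simplified stand-in invented for illustration; it does not use `httpware`'s Middleware protocol.

```python
import asyncio
from collections.abc import Awaitable, Callable

Handler = Callable[[str], Awaitable[int]]


class ToyBulkhead:
    def __init__(self, max_concurrent: int) -> None:
        self._slots = asyncio.Semaphore(max_concurrent)
        self.acquisitions = 0

    def wrap(self, nxt: Handler) -> Handler:
        async def call(request: str) -> int:
            async with self._slots:  # one slot per pass through this layer
                self.acquisitions += 1
                return await nxt(request)

        return call


class ToyRetry:
    def __init__(self, max_attempts: int = 3) -> None:
        self._max_attempts = max_attempts

    def wrap(self, nxt: Handler) -> Handler:
        async def call(request: str) -> int:
            status = 503
            for _ in range(self._max_attempts):
                status = await nxt(request)
                if status != 503:
                    break
            return status

        return call


def build_chain(middleware: list, terminal: Handler) -> Handler:
    # First list entry ends up outermost, matching middleware=[outer, inner].
    handler = terminal
    for layer in reversed(middleware):
        handler = layer.wrap(handler)
    return handler


async def always_503(request: str) -> int:
    return 503


async def slots_used(bulkhead_outside: bool) -> int:
    bulkhead, retry = ToyBulkhead(10), ToyRetry(3)
    order = [bulkhead, retry] if bulkhead_outside else [retry, bulkhead]
    await build_chain(order, always_503)("GET /users/1")
    return bulkhead.acquisitions
```

With the bulkhead outside, `asyncio.run(slots_used(True))` returns 1; with it inside, `asyncio.run(slots_used(False))` returns 3, one acquisition per attempt.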
+
+Cross-cutting middleware that emit per-call state (e.g., the Request-ID middleware in the [Middleware guide](middleware.md)) should sit outside `Retry` for the same reason — so all attempts of one call share one ID rather than getting a fresh ID per attempt.
+
+## See also
+
+- **[Middleware guide](middleware.md)** — write your own resilience middleware against the same protocol `Retry` and `Bulkhead` use.
+- **[Errors reference](errors.md)** — `RetryBudgetExhaustedError`, `BulkheadFullError`, and the broader exception tree.
+- **[Observability](index.md#observability)** — the four operational events these middleware emit.
+- **`planning/engineering.md` §3** — the formal Middleware/Seam-A contract.
+````
+
+- [ ] **Step 2: Verify mkdocs strict build is clean**
+
+```bash
+uv run --with mkdocs --with mkdocs-material mkdocs build --strict 2>&1 | tail -10
+rm -rf site/
+```
+Expected: `Documentation built in <time>` with no warnings. Note: `resilience.md` is not yet in the nav (Task 5 adds it), and mkdocs reports every `.md` file under `docs_dir` that is missing from the nav. By default that report is logged at info level rather than as a warning, so the strict build should still pass. **If the build does complain about `resilience.md` being unreferenced**, that's expected; proceed to commit anyway, since Task 5 will add the nav entry and re-verify. Document the warning in your DONE report if it appears.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add docs/resilience.md
+git commit -m "docs(resilience): write Retry/RetryBudget/Bulkhead reference
+
+New docs/resilience.md (~190 lines) — full parameter tables, defaults,
+Retry-After parsing rules, streaming-body refusal contract, exhaustion
+behavior, the token-bucket formula + why-the-floor-matters note,
+budget/bulkhead sharing across clients, composition guidance, and
+cross-references to Middleware guide, Errors reference, and the
+Observability section.
+
+No new built-in middleware. Documents what already shipped through
+v0.4 (Retry/RetryBudget/Bulkhead) and v0.6 (the observability events
+each emits)."
+```
+
+---
+
+## Task 3: Create `docs/errors.md`
+
+**Files:**
+- Create: `docs/errors.md`
+
+- [ ] **Step 1: Create the file with the full content below**
+
+````markdown
+# Errors reference
+
+`httpware` raises typed exceptions automatically — everything inherits `ClientError`, and HTTP responses with 4xx/5xx status raise status-keyed `StatusError` subclasses without you having to call `response.raise_for_status()`.
+
+For the resilience-specific errors (`RetryBudgetExhaustedError`, `BulkheadFullError`) see the [Resilience reference](resilience.md).
+
+## The exception tree
+
+```
+ClientError                          (catch-all for anything httpware raises)
+├── TransportError                   (connection/network/protocol failure pre-response)
+│   └── NetworkError                 (transient — safe to retry; covered by Retry's defaults)
+├── TimeoutError                     (also inherits builtins.TimeoutError — except OSError catches it)
+├── StatusError                      (got a response but its status was 4xx/5xx)
+│   ├── ClientStatusError            (any 4xx — fallback for unknown 4xx codes)
+│   │   ├── BadRequestError          (400)
+│   │   ├── UnauthorizedError        (401)
+│   │   ├── ForbiddenError           (403)
+│   │   ├── NotFoundError            (404)
+│   │   ├── ConflictError            (409)
+│   │   ├── UnprocessableEntityError (422)
+│   │   └── RateLimitedError         (429)
+│   └── ServerStatusError            (any 5xx — fallback for unknown 5xx codes)
+│       ├── InternalServerError     (500)
+│       └── ServiceUnavailableError (503)
+├── RetryBudgetExhaustedError       (a retry was needed but the budget refused)
+└── BulkheadFullError                (acquire_timeout elapsed before a slot opened)
+```
+
+## Status-to-exception mapping
+
+| Status | Exception class |
+|---|---|
+| 400 | `BadRequestError` |
+| 401 | `UnauthorizedError` |
+| 403 | `ForbiddenError` |
+| 404 | `NotFoundError` |
+| 409 | `ConflictError` |
+| 422 | `UnprocessableEntityError` |
+| 429 | `RateLimitedError` |
+| 500 | `InternalServerError` |
+| 503 | `ServiceUnavailableError` |
+| other 4xx | `ClientStatusError` (fallback) |
+| other 5xx | `ServerStatusError` (fallback) |
+
+The fallback assumes `400 ≤ status < 600`. Statuses outside that range don't raise (they return the response as-is).
+
+## Catching strategies
+
+```python
+import logging
+
+from httpware import (
+    AsyncClient,
+    ClientError,
+    StatusError,
+    NetworkError,
+    TimeoutError,
+    NotFoundError,
+    RetryBudgetExhaustedError,
+    BulkheadFullError,
+)
+
+_LOGGER = logging.getLogger(__name__)
+
+
+async def fetch(client: AsyncClient, user_id: int) -> dict | None:
+    try:
+        return await client.get(f"/users/{user_id}", response_model=dict)
+    except NotFoundError:
+        # Specific status — most precise. Convert to None as the "absent" sentinel.
+        return None
+    except StatusError as exc:
+        # Got a response, but its status was 4xx/5xx and not one we handle specifically.
+        # exc.response.* is available — headers, content, request, etc.
+        _LOGGER.warning("upstream returned %s for %s", exc.response.status_code, exc.response.request.url)
+        raise
+    except NetworkError:
+        # Transient transport failure. Already retried by the default Retry middleware
+        # (if installed) when the method was idempotent. Seeing this means retries
+        # exhausted or the method was non-idempotent.
+        raise
+    except (RetryBudgetExhaustedError, BulkheadFullError) as exc:
+        # Resilience refusal — backpressure signal. Back off the caller.
+        _LOGGER.error("resilience refused: %s", exc)
+        raise
+    except ClientError:
+        # Catch-all for anything else httpware raised.
+        raise
+```
+
+`TimeoutError` inherits from both `ClientError` and `builtins.TimeoutError`, so `except builtins.TimeoutError` and `except OSError` both catch it (on Python 3.11+ this is the same builtin class that `asyncio.wait_for` raises). Stdlib-style timeout handling therefore works unchanged.
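A small sketch of the double inheritance, using stand-in classes rather than the library's own. `ToyTimeoutError` and `caught_by` are hypothetical names for illustration.

```python
import builtins


class ClientError(Exception):
    """Stand-in for the library's catch-all base class."""


class ToyTimeoutError(ClientError, builtins.TimeoutError):
    """Caught by library-style and stdlib-style handlers alike."""


def caught_by(handler_type: type[BaseException]) -> bool:
    """Report whether an `except handler_type` clause catches the error."""
    try:
        raise ToyTimeoutError("attempt timed out")
    except handler_type:
        return True
    except Exception:
        return False
```

`caught_by(ClientError)`, `caught_by(builtins.TimeoutError)`, and `caught_by(OSError)` all return `True`, because `builtins.TimeoutError` is itself a subclass of `OSError`.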
+
+## `exc.response.*` access pattern
+
+For any `StatusError` subclass, the raw `httpx2.Response` is on `exc.response`:
+
+```python
+exc.response.status_code     # 404
+exc.response.headers          # httpx2.Headers — case-insensitive
+exc.response.content          # raw bytes
+exc.response.text             # decoded body
+exc.response.json()           # parsed JSON (raises if not JSON)
+exc.response.request          # the failing httpx2.Request
+exc.response.request.url      # the failing URL (httpx2.URL)
+exc.response.request.method   # the HTTP method
+```
+
+**Security note:** `__repr__` and the exception's summary message strip `user:pass@` userinfo from the URL to avoid leaking credentials in tracebacks. **Query-string secrets are NOT stripped** — keep secrets out of query strings.
+
+## Resilience-error payloads
+
+`RetryBudgetExhaustedError` carries:
+- `last_response: httpx2.Response | None` — the last response observed before the budget refused (None if all failures were transport-level)
+- `last_exception: BaseException | None` — the last exception observed before the budget refused
+- `attempts: int` — number of attempts already completed
+
+`BulkheadFullError` carries:
+- `max_concurrent: int` — the configured cap
+- `acquire_timeout: float | None` — the configured timeout
+
+Use these for caller-side logging / alerting:
+
+```python
+except RetryBudgetExhaustedError as exc:
+    _LOGGER.error(
+        "budget exhausted after %d attempts; last_status=%s",
+        exc.attempts,
+        exc.last_response.status_code if exc.last_response is not None else None,
+    )
+```
+
+## See also
+
+- **[Resilience reference](resilience.md)** — `Retry`, `RetryBudget`, `Bulkhead` parameter tables.
+- **[Middleware guide](middleware.md)** — the `@on_error` decorator can translate exceptions into responses.
+- **`planning/engineering.md` §4** — the formal exception contract.
+````
+
+- [ ] **Step 2: Verify mkdocs strict build is clean**
+
+```bash
+uv run --with mkdocs --with mkdocs-material mkdocs build --strict 2>&1 | tail -10
+rm -rf site/
+```
+Expected: `Documentation built in <time>`. Same orphan-page caveat as Task 2.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add docs/errors.md
+git commit -m "docs(errors): write exception tree and catching strategies reference
+
+New docs/errors.md (~130 lines) — full StatusError hierarchy as an
+ASCII tree, status-to-exception mapping table, practical catching
+patterns (specific status -> StatusError -> NetworkError -> resilience
+errors -> ClientError catch-all), exc.response.* access pattern with
+the userinfo-stripping security note, and the payloads on
+RetryBudgetExhaustedError / BulkheadFullError for caller-side logging.
+
+No new exception classes. Documents what already shipped through
+v0.4 (resilience errors) and v0.2 (status-keyed tree)."
+```
+
+---
+
+## Task 4: Create `docs/testing.md`
+
+**Files:**
+- Create: `docs/testing.md`
+
+- [ ] **Step 1: Create the file with the full content below**
+
+````markdown
+# Testing guide
+
+`httpware`'s test seam is `httpx2`. Pass any `httpx2.AsyncClient` (including one built on `httpx2.MockTransport`) to `AsyncClient(httpx2_client=...)` — the middleware chain still runs end-to-end, only the wire is mocked. No special test mode, no monkey-patching, no `respx`.
+
+## The basic pattern
+
+```python
+from http import HTTPStatus
+
+import httpx2
+
+from httpware import AsyncClient
+
+
+def handler(request: httpx2.Request) -> httpx2.Response:
+    return httpx2.Response(HTTPStatus.OK, json={"id": 1, "name": "Alice"})
+
+
+async def test_get_user() -> None:
+    transport = httpx2.MockTransport(handler)
+    async with AsyncClient(httpx2_client=httpx2.AsyncClient(transport=transport)) as client:
+        response = await client.get("https://api.example.test/users/1")
+    assert response.status_code == HTTPStatus.OK
+    assert response.json()["name"] == "Alice"
+```
+
+The handler can be sync or async; `httpx2.MockTransport` supports both. The test above uses a sync handler.
+
+If you use `pytest-asyncio` in auto-mode (`asyncio_mode = "auto"` under `[tool.pytest.ini_options]`), async test functions don't need the `@pytest.mark.asyncio` decorator.
+
+## Recording / stateful handlers
+
+For tests that need to vary the response by call count or assert on the requests that came in, use a handler with instance state:
+
+```python
+from httpware.middleware.resilience import Retry
+
+
+class _ResponseSequence:
+    """Returns each status in order; records every request received."""
+
+    def __init__(self, statuses: list[int]) -> None:
+        self._statuses = list(statuses)
+        self.calls: list[httpx2.Request] = []
+
+    def __call__(self, request: httpx2.Request) -> httpx2.Response:
+        self.calls.append(request)
+        status = self._statuses.pop(0) if self._statuses else HTTPStatus.OK
+        return httpx2.Response(status, request=request)
+
+
+async def test_retry_succeeds_after_503() -> None:
+    handler = _ResponseSequence([HTTPStatus.SERVICE_UNAVAILABLE, HTTPStatus.OK])
+    transport = httpx2.MockTransport(handler)
+    async with AsyncClient(
+        httpx2_client=httpx2.AsyncClient(transport=transport),
+        middleware=[Retry(base_delay=0.001, max_delay=0.002)],
+    ) as client:
+        response = await client.get("https://example.test/x")
+    assert response.status_code == HTTPStatus.OK
+    assert len(handler.calls) == 2  # initial + 1 retry
+```
+
+The `base_delay`/`max_delay` are set tiny so the test runs instantly — no need for `freezegun` or sleep injection in most cases.
+
+## Testing your custom middleware
+
+Compose your middleware with the mock transport to exercise the chain end-to-end:
+
+```python
+async def test_my_middleware_adds_header() -> None:
+    handler = _ResponseSequence([HTTPStatus.OK])
+    async with AsyncClient(
+        httpx2_client=httpx2.AsyncClient(transport=httpx2.MockTransport(handler)),
+        middleware=[MyHeaderMiddleware()],
+    ) as client:
+        await client.get("https://example.test/x")
+    assert handler.calls[0].headers["X-My-Header"] == "expected-value"
+```
+
+For middleware with state-keeping (counters, circuit-breaker state), assert on instance attributes after running the call.
+
+## Why not `respx`?
+
+`httpware` deliberately uses `httpx2.MockTransport` instead of `respx` for its own tests. `MockTransport` is the public test seam in `httpx` — supported by the maintainers, stable across versions, lives in the public API surface. `respx` patches private internals and has historically broken across `httpx` major versions. Stick with `MockTransport` unless you have a specific reason not to.
+
+## See also
+
+- **[Middleware guide](middleware.md)** — write the middleware you're testing.
+- **[Resilience reference](resilience.md)** — testing `Retry`/`Bulkhead` configurations.
+- **`planning/engineering.md` §6** — the project's own testing patterns (Hypothesis property-based tests, `pytest-asyncio` auto-mode, the `RecordedTransport`-was-removed history).
+````
+
+- [ ] **Step 2: Verify mkdocs strict build is clean**
+
+```bash
+uv run --with mkdocs --with mkdocs-material mkdocs build --strict 2>&1 | tail -10
+rm -rf site/
+```
+Expected: `Documentation built in <time>`. Same orphan caveat.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add docs/testing.md
+git commit -m "docs(testing): write mock-transport injection pattern guide
+
+New docs/testing.md (~90 lines) — the httpx2.MockTransport pattern
+that the project's own tests use; instance-state handler for
+stateful responses (response sequences, request recording); composing
+custom middleware with the mock transport for end-to-end tests; brief
+'why not respx' note pointing at the private-internals risk.
+
+No code changes. Documents the test pattern that has been in tests/
+since v0.2 but never user-facing."
+```
+
+---
+
+## Task 5: Update `mkdocs.yml` nav
+
+**Files:**
+- Modify: `mkdocs.yml`
+
+- [ ] **Step 1: Replace the nav block**
+
+The current `mkdocs.yml` nav (after the prior 0.7 commits) reads:
+```yaml
+nav:
+  - Quick-Start: index.md
+  - Middleware: middleware.md
+  - Development:
+      - Contributing: dev/contributing.md
+```
+
+Replace with:
+```yaml
+nav:
+  - Quick-Start: index.md
+  - Resilience: resilience.md
+  - Middleware: middleware.md
+  - Errors: errors.md
+  - Testing: testing.md
+  - Development:
+      - Contributing: dev/contributing.md
+```
+
+Order rationale: Resilience precedes Middleware because most users will *use* the built-ins (`Retry`, `Bulkhead`) before they *write* their own. Errors and Testing follow as reference + setup-friction pages.
+
+- [ ] **Step 2: Verify mkdocs strict build is clean**
+
+```bash
+uv run --with mkdocs --with mkdocs-material mkdocs build --strict 2>&1 | tail -10
+rm -rf site/
+```
+Expected: `Documentation built in <time>` with no warnings. All three new pages now have nav entries — any orphan warnings from Tasks 2-4 disappear.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add mkdocs.yml
+git commit -m "docs(nav): add Resilience / Errors / Testing pages to mkdocs nav
+
+Six top-level entries after this:
+  Quick-Start, Resilience, Middleware, Errors, Testing, Development
+
+Resilience precedes Middleware because most users reach for the
+built-in Retry/Bulkhead before writing their own. Errors and Testing
+follow as reference + setup-friction pages."
+```
+
+---
+
+## Task 6: Update `docs/index.md` "Where to go next" + amend Middleware bullet
+
+**Files:**
+- Modify: `docs/index.md`
+
+The prior 0.7 commit `61306fc` added a Middleware-guide bullet as the first entry in "Where to go next". This task adds three more bullets and extends the existing Middleware bullet with `and OpenTelemetry wiring` (since Task 1 added that section).
+
+- [ ] **Step 1: Replace the "Where to go next" block**
+
+Find this current block (around L107-L112):
+```markdown
+## Where to go next
+
+- **[Middleware guide](middleware.md)** — write your own middleware. Covers the Middleware Protocol, the phase decorators, and a worked Request-ID propagation example.
+- **[Engineering Notes](https://github.com/modern-python/httpware/blob/main/planning/engineering.md)** — design invariants, the three protocol seams, exception contract, module layout, testing patterns, optional-extras pattern. Lives in the repo at `planning/engineering.md`.
+- **[Contributing](dev/contributing.md)** — setup, conventions, workflow.
+- **[Release notes](https://github.com/modern-python/httpware/releases)** — per-version changelogs.
+```
+
+Replace with:
+```markdown
+## Where to go next
+
+- **[Resilience reference](resilience.md)** — every parameter on `Retry`, `RetryBudget`, and `Bulkhead`; the retry-rule matrix; Retry-After parsing; budget sharing.
+- **[Middleware guide](middleware.md)** — write your own middleware. Covers the Middleware Protocol, the phase decorators, a worked Request-ID propagation example, and OpenTelemetry wiring.
+- **[Errors reference](errors.md)** — the full exception tree, catching strategies, `exc.response.*` access pattern.
+- **[Testing guide](testing.md)** — mock-transport injection pattern for testing code that uses `httpware`.
+- **[Engineering Notes](https://github.com/modern-python/httpware/blob/main/planning/engineering.md)** — design invariants, the three protocol seams, exception contract, module layout, testing patterns, optional-extras pattern. Lives in the repo at `planning/engineering.md`.
+- **[Contributing](dev/contributing.md)** — setup, conventions, workflow.
+- **[Release notes](https://github.com/modern-python/httpware/releases)** — per-version changelogs.
+```
+
+Three new bullets (Resilience at the top, Errors and Testing after Middleware), the existing Middleware bullet amended with `and OpenTelemetry wiring`, and the Engineering/Contributing/Release-notes bullets unchanged.
+
+- [ ] **Step 2: Verify mkdocs strict build is clean**
+
+```bash
+uv run --with mkdocs --with mkdocs-material mkdocs build --strict 2>&1 | tail -10
+rm -rf site/
+```
+Expected: clean. All four internal links resolve (resilience.md, middleware.md, errors.md, testing.md).
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add docs/index.md
+git commit -m "docs(index): expand Where-to-go-next with Resilience / Errors / Testing
+
+Three new bullets (Resilience reference, Errors reference, Testing
+guide), plus an addendum to the existing Middleware bullet noting the
+new OpenTelemetry wiring section. Engineering / Contributing /
+Release-notes bullets unchanged."
+```
+
+---
+
+## Task 7: Update `planning/engineering.md` §8 — enrich Epic 3 SHIPPED note
+
+**Files:**
+- Modify: `planning/engineering.md`
+
+The prior 0.7 commit `07ac068` recorded `3-6` as shipped in v0.7 and marked Epic 3 closed. This task adds a sub-bullet noting that v0.7 also bundles the rest of the first-cut user docs surface.
+
+- [ ] **Step 1: Insert a new sub-bullet under the existing Epic 3 "Shipped in v0.7" line**
+
+Find the current Epic 3 block (around L131-L135):
+```markdown
+- **Epic 3 — Resilience:**
+  - **Shipped in v0.4 slice 1:** `Retry` middleware + Finagle-style `RetryBudget` token bucket + `attempt_timeout=` parameter (folded-in 3-1). See [`planning/specs/2026-06-05-retry-and-retry-budget-design.md`](specs/2026-06-05-retry-and-retry-budget-design.md) and [`planning/plans/2026-06-05-retry-and-retry-budget-plan.md`](plans/2026-06-05-retry-and-retry-budget-plan.md).
+  - **Shipped in v0.4 slice 2:** `Bulkhead` middleware (concurrency limiter via `asyncio.Semaphore` with bounded acquire wait). See [`planning/specs/2026-06-05-bulkhead-design.md`](specs/2026-06-05-bulkhead-design.md) and [`planning/plans/2026-06-05-bulkhead-plan.md`](plans/2026-06-05-bulkhead-plan.md).
+  - **Shipped in v0.7:** `3-6` extension-slot docs — [`docs/middleware.md`](../docs/middleware.md). Covers the Middleware Protocol, phase decorators, a Request-ID worked example, and "when NOT to write a middleware." See [`planning/specs/2026-06-05-extension-slot-docs-design.md`](specs/2026-06-05-extension-slot-docs-design.md) and [`planning/plans/2026-06-05-extension-slot-docs-plan.md`](plans/2026-06-05-extension-slot-docs-plan.md).
+  - **Epic 3 closed.**
+```
+
+Insert this new sub-bullet between the existing "Shipped in v0.7" line and the "Epic 3 closed." line:
+```markdown
+  - **v0.7 also bundles** the rest of the first-cut user docs surface — [`docs/resilience.md`](../docs/resilience.md) (Retry/RetryBudget/Bulkhead reference), [`docs/errors.md`](../docs/errors.md) (exception tree + catching strategies), [`docs/testing.md`](../docs/testing.md) (mock-transport injection pattern) — plus an "OpenTelemetry wiring" section appended to `docs/middleware.md`. See [`planning/specs/2026-06-05-v0.7-docs-expansion-design.md`](specs/2026-06-05-v0.7-docs-expansion-design.md) and [`planning/plans/2026-06-05-v0.7-docs-expansion-plan.md`](plans/2026-06-05-v0.7-docs-expansion-plan.md).
+```
+
+So the Epic 3 block becomes six lines: the header, the two v0.4 slices, the v0.7 extension-slot-docs line, the new v0.7-also-bundles line, and the "Epic 3 closed." closer.
+
+- [ ] **Step 2: Commit**
+
+```bash
+git add planning/engineering.md
+git commit -m "docs(engineering): note v0.7 also bundled the rest of user-docs surface
+
+Adds a sub-bullet under Epic 3's existing 'Shipped in v0.7' line
+calling out docs/resilience.md, docs/errors.md, docs/testing.md, and
+the OpenTelemetry-wiring section appended to docs/middleware.md. Links
+to the expansion spec and plan."
+```
+
+---
+
+## Task 8: Rewrite `planning/releases/0.7.0.md`
+
+**Files:**
+- Modify: `planning/releases/0.7.0.md` (full rewrite)
+
+The prior 0.7 commit `b0aac27` wrote the release notes scoped to just the Middleware guide. This task rewrites them to cover the expanded scope. The GitHub Release will be created from this file after merge, so the new content must stand alone as user-facing release notes.
+
+- [ ] **Step 1: Replace the entire file contents**
+
+Overwrite `planning/releases/0.7.0.md` with this content:
+
+````markdown
+# httpware 0.7.0 — First-cut user docs (docs-only)
+
+**0.7.0 is a docs-only release. No API changes.** Code written against 0.6.0 continues to work unchanged.
+
+This release ships the first-cut user-facing documentation surface — every shipped feature through 0.6 now has a user-facing reference page, and the two highest-friction adoption recipes (test-mocking and OpenTelemetry wiring) are concrete. Epic 3 (Resilience) closes with this release.
+
+## What's new
+
+Four new docs deliverables on the docs site:
+
+- **[`docs/middleware.md`](../../docs/middleware.md)** — write your own middleware against `httpware.middleware.Middleware` and `Next`. Covers the protocol, the phase decorators (`@before_request`, `@after_response`, `@on_error`), a worked `RequestIdMiddleware` example, a "when NOT to write a middleware" section, **and an "OpenTelemetry wiring" section** with a minimal SDK + `opentelemetry-instrumentation-httpx` setup that makes the 0.6.0 Retry/Bulkhead observability events visible as span events.
+- **[`docs/resilience.md`](../../docs/resilience.md)** — deep-dive reference for `Retry`, `RetryBudget`, and `Bulkhead`: every parameter with its default and effect, the retry-rule matrix (status codes × methods), Retry-After parsing, streaming-body refusal contract, the token-bucket formula, why the floor matters, budget/bulkhead sharing across clients, and composition guidance.
+- **[`docs/errors.md`](../../docs/errors.md)** — the full `StatusError` hierarchy as an ASCII tree, the status-to-exception mapping table, practical catching strategies (specific status → `StatusError` → `NetworkError` → resilience errors → `ClientError` catch-all), the `exc.response.*` access pattern with the userinfo-stripping security note, and the payloads on `RetryBudgetExhaustedError` / `BulkheadFullError` for caller-side logging.
+- **[`docs/testing.md`](../../docs/testing.md)** — the `httpx2.MockTransport` injection pattern via `AsyncClient(httpx2_client=...)`. Recording/stateful handlers, testing custom middleware end-to-end, brief "why not respx" note pointing at the private-internals risk.
+
+Plus discovery: three new mkdocs nav entries (Resilience, Errors, Testing), four new bullets in `docs/index.md` "Where to go next", and engineering notes updated.
+
+## What's not in this release
+
+- **No source code changes.** The Middleware protocol, phase decorators, resilience primitives, exception tree, and test-transport seam all already existed; this release documents them.
+- **No new built-in middleware.** No CircuitBreaker, no RateLimiter, no auth helpers.
+- **No API autodoc** (e.g., mkdocstrings). Hand-written user docs only.
+- **No benchmarks page, no migration guide, no speculative cookbook recipes.** Reference pages for shipped features + concrete adoption recipes only.
+- **No mkdocs publish workflow / docs-site infrastructure.** That's Epic 6 (story `6-2`); this release just keeps `mkdocs build --strict` green.
+
+## Epic 3 closed
+
+Epic 3 (Resilience) has shipped end-to-end:
+- v0.4 slice 1 — `Retry` + `RetryBudget` + `attempt_timeout=`
+- v0.4 slice 2 — `Bulkhead`
+- v0.7 — `3-6` extension-slot docs + the rest of the first-cut user-docs surface
+
+Remaining roadmap is Epic 6 (ship v1.0): `6-2` docs site infrastructure (mkdocs publishing, hand-written content only — no autodoc), and `6-5` release flow (Trusted Publishers + Sigstore).
+
+## References
+
+- Middleware spec: [`planning/specs/2026-06-05-extension-slot-docs-design.md`](../specs/2026-06-05-extension-slot-docs-design.md)
+- Docs-expansion spec: [`planning/specs/2026-06-05-v0.7-docs-expansion-design.md`](../specs/2026-06-05-v0.7-docs-expansion-design.md)
+- Middleware plan: [`planning/plans/2026-06-05-extension-slot-docs-plan.md`](../plans/2026-06-05-extension-slot-docs-plan.md)
+- Docs-expansion plan: [`planning/plans/2026-06-05-v0.7-docs-expansion-plan.md`](../plans/2026-06-05-v0.7-docs-expansion-plan.md)
+- Roadmap: [`planning/engineering.md`](../engineering.md) §8
+````
+
+- [ ] **Step 2: Commit**
+
+```bash
+git add planning/releases/0.7.0.md
+git commit -m "docs(release): rewrite 0.7.0 notes for expanded docs scope
+
+The prior release notes covered just the Middleware guide. This
+rewrite covers the full first-cut user-docs surface that 0.7 actually
+ships:
+- docs/middleware.md (incl. new OTel-wiring section)
+- docs/resilience.md (Retry/RetryBudget/Bulkhead reference)
+- docs/errors.md (exception tree + catching strategies)
+- docs/testing.md (mock-transport pattern)
+
+Title changes from 'Middleware extension guide' to 'First-cut user
+docs'. 'What's not in this release' enriched with the autodoc /
+benchmarks / migration-guide / cookbook out-of-scope items per the
+project's docs philosophy."
+```
+
+---
+
+## Task 9: Final verification + push + update PR #28
+
+**Files:** none modified by edits; only verification + remote updates.
+
+- [ ] **Step 1: Lint-ci (sanity)**
+
+```bash
+just lint-ci
+```
+Expected: clean. No source code changes, so this is a pure no-op confirmation.
+
+- [ ] **Step 2: Full test suite (sanity)**
+
+```bash
+just test
+```
+Expected: 251 passed, 100% coverage. No source code changes, so identical to the prior run.
+
+- [ ] **Step 3: mkdocs strict build**
+
+```bash
+uv run --with mkdocs --with mkdocs-material mkdocs build --strict 2>&1 | tail -20
+rm -rf site/
+```
+Expected: `Documentation built in <time>` with zero warnings. Every internal link in the new pages must resolve. The red `×` lines from mkdocs-material are plugin self-notices, not strict-build failures — the pass signal is `Documentation built in <time>`.
+
+- [ ] **Step 4: Cross-reference scan**
+
+```bash
+grep -nE '\]\(' docs/resilience.md docs/errors.md docs/testing.md docs/middleware.md
+```
+
+Expected: every link target is either:
+- A docs-internal anchor (e.g., `middleware.md`, `index.md#observability`, `errors.md`) — already verified by mkdocs strict
+- A clearly-external URL (`opentelemetry.io/...`)
+- A repo path used as prose reference (`planning/engineering.md`) — not a link target
+
+If a link points to a missing anchor (e.g., `middleware.md#nonexistent`), mkdocs strict would have caught it in Step 3.
+
+- [ ] **Step 5: Architecture invariants (sanity)**
+
+```bash
+grep -rE 'httpx2\._' src/httpware/ || echo "PASS: no httpx2 private API"
+grep -rE 'from __future__ import annotations' src/httpware/ || echo "PASS: no __future__ annotations"
+grep -rE '\bprint\(' src/httpware/ || echo "PASS: no print()"
+grep -rE 'logging\.(basicConfig|getLogger)\(\)' src/httpware/ || echo "PASS: no global logging"
+grep -rE '# (type|mypy): ignore' src/httpware/ || echo "PASS: no type/mypy ignore"
+```
+Each should print PASS. (Docs-only — no source files touched.)
+
+- [ ] **Step 6: Push the new commits**
+
+```bash
+git push origin feat/v0.7-middleware-docs
+```
+
+Expected: 8 new commits pushed (Tasks 1-8). PR #28 picks them up automatically.
+
+- [ ] **Step 7: Update PR #28 title + body**
+
+```bash
+gh pr edit 28 --title "feat(v0.7): first-cut user docs — Middleware + Resilience + Errors + Testing (closes Epic 3)" --body "$(cat <<'EOF'
+## Summary
+
+Closes Epic 3 (Resilience). Ships the first-cut user-facing documentation surface — every shipped feature through 0.6 now has a user-facing reference page, and the two highest-friction adoption recipes (test-mocking and OpenTelemetry wiring) are concrete.
+
+- **New `docs/middleware.md`** — write your own middleware against the protocol. Covers the protocol, the phase decorators, a worked Request-ID propagation example, a "when NOT to write a middleware" section, **and an OpenTelemetry wiring section** (SDK + opentelemetry-instrumentation-httpx setup).
+- **New `docs/resilience.md`** — Retry/RetryBudget/Bulkhead parameter tables + retry-rule matrix + Retry-After parsing + streaming-body refusal contract + token-bucket formula + budget sharing + composition guidance.
+- **New `docs/errors.md`** — full StatusError tree + status-to-exception mapping + catching strategies + exc.response.* access pattern + resilience-error payloads.
+- **New `docs/testing.md`** — `httpx2.MockTransport` injection pattern + recording handlers + testing custom middleware + why not respx.
+- **Discovery:** mkdocs nav (3 new entries), `docs/index.md` Where-to-go-next (3 new bullets + 1 amended), `planning/engineering.md` §8 (v0.7 SHIPPED note enriched), `planning/releases/0.7.0.md` rewritten to cover the expanded scope.
+- **Docs-only:** zero source files modified. The protocol, decorators, resilience primitives, exception tree, and test-transport seam all already existed (shipped through v0.6); this release documents them.
+
+Specs: [extension-slot](planning/specs/2026-06-05-extension-slot-docs-design.md), [docs expansion](planning/specs/2026-06-05-v0.7-docs-expansion-design.md)
+Plans: [extension-slot](planning/plans/2026-06-05-extension-slot-docs-plan.md), [docs expansion](planning/plans/2026-06-05-v0.7-docs-expansion-plan.md)
+Release notes: [planning/releases/0.7.0.md](planning/releases/0.7.0.md)
+
+## Test Plan
+
+- [x] `just lint-ci` — clean (no source files changed)
+- [x] `just test` — 251 passed, 100% coverage (no source files changed)
+- [x] `mkdocs build --strict` — clean across all 4 new/edited pages + nav + index touchups
+- [x] Architecture invariants — no `httpx2._`, no `__future__` annotations, no `print()`, no global logging, no `# type:`/`# mypy:` ignores
+- [ ] Reviewer: spot-check the Retry-After / streaming-refusal / token-bucket-formula sections of `docs/resilience.md` against the actual implementation behavior (most likely place for doc drift)
+- [ ] Reviewer: confirm the OpenTelemetry wiring snippet actually produces visible span events with a real `opentelemetry-sdk` install — the minimal example claims so but isn't gated by any test
+- [ ] Reviewer: nav order — Resilience precedes Middleware (use built-ins before writing your own). Comment if you think Middleware should come first
+
+🤖 Generated with [Claude Code](https://claude.com/claude-code)
+EOF
+)" 2>&1 | tail -3
+```
+
+Expected: PR title + body updated; URL printed.
+
+---
+
+## Out of scope for this plan (per the spec)
+
+These items are deliberately deferred or retired. Do NOT do them in this PR:
+
+- **No source code changes.** Zero `src/` files modified. The protocol + decorators + resilience primitives + exception tree all already exist; this PR documents them.
+- **No new built-in middleware.** No CircuitBreaker, no RateLimiter.
+- **No API autodoc / mkdocstrings.** Per the user-docs-philosophy memory.
+- **No benchmarks page, no migration guide, no speculative cookbook recipes.** Per the same memory.
+- **No dedicated `docs/tracing.md` page.** The OTel wire-up rides as a section of `docs/middleware.md` (Task 1).
+- **No mkdocs publish workflow / docs-site infrastructure.** Epic 6 story `6-2`.
+- **No version bump in `pyproject.toml`.** Tag-driven (`uv version $GITHUB_REF_NAME` overwrites at build).
+- **No CLAUDE.md changes.**
+- **No new branch.** All 8 new commits stack on top of the existing 6 + 2 spec commits on `feat/v0.7-middleware-docs`.
diff --git a/planning/releases/0.7.0.md b/planning/releases/0.7.0.md
new file mode 100644
index 0000000..ea1dc7b
--- /dev/null
+++ b/planning/releases/0.7.0.md
@@ -0,0 +1,41 @@
+# httpware 0.7.0 — First-cut user docs (docs-only)
+
+**0.7.0 is a docs-only release. No API changes.** Code written against 0.6.0 continues to work unchanged.
+
+This release ships the first-cut user-facing documentation surface — every shipped feature through 0.6 now has a user-facing reference page, and the two highest-friction adoption recipes (test-mocking and OpenTelemetry wiring) are concrete. Epic 3 (Resilience) closes with this release.
+
+## What's new
+
+Four new docs deliverables on the docs site:
+
+- **[`docs/middleware.md`](../../docs/middleware.md)** — write your own middleware against `httpware.middleware.Middleware` and `Next`. Covers the protocol, the phase decorators (`@before_request`, `@after_response`, `@on_error`), a worked `RequestIdMiddleware` example, a "when NOT to write a middleware" section, **and an "OpenTelemetry wiring" section** with a minimal SDK + `opentelemetry-instrumentation-httpx` setup that makes the 0.6.0 Retry/Bulkhead observability events visible as span events.
+- **[`docs/resilience.md`](../../docs/resilience.md)** — deep-dive reference for `Retry`, `RetryBudget`, and `Bulkhead`: every parameter with its default and effect, the retry-rule matrix (status codes × methods), Retry-After parsing, streaming-body refusal contract, the token-bucket formula, why the floor matters, budget/bulkhead sharing across clients, and composition guidance.
+- **[`docs/errors.md`](../../docs/errors.md)** — the full `StatusError` hierarchy as an ASCII tree, the status-to-exception mapping table, practical catching strategies (specific status → `StatusError` → `NetworkError` → resilience errors → `ClientError` catch-all), the `exc.response.*` access pattern with the userinfo-stripping security note, and the payloads on `RetryBudgetExhaustedError` / `BulkheadFullError` for caller-side logging.
+- **[`docs/testing.md`](../../docs/testing.md)** — the `httpx2.MockTransport` injection pattern via `AsyncClient(httpx2_client=...)`. Recording/stateful handlers, testing custom middleware end-to-end, brief "why not respx" note pointing at the private-internals risk.
+
+Plus discovery: three new mkdocs nav entries (Resilience, Errors, Testing), four new bullets in `docs/index.md` "Where to go next", and engineering notes updated.
+
+## What's not in this release
+
+- **No source code changes.** The Middleware protocol, phase decorators, resilience primitives, exception tree, and test-transport seam all already existed; this release documents them.
+- **No new built-in middleware.** No CircuitBreaker, no RateLimiter, no auth helpers.
+- **No API autodoc** (e.g., mkdocstrings). Hand-written user docs only.
+- **No benchmarks page, no migration guide, no speculative cookbook recipes.** Reference pages for shipped features + concrete adoption recipes only.
+- **No mkdocs publish workflow / docs-site infrastructure.** That's Epic 6 (story `6-2`); this release just keeps `mkdocs build --strict` green.
+
+## Epic 3 closed
+
+Epic 3 (Resilience) has shipped end-to-end:
+- v0.4 slice 1 — `Retry` + `RetryBudget` + `attempt_timeout=`
+- v0.4 slice 2 — `Bulkhead`
+- v0.7 — `3-6` extension-slot docs + the rest of the first-cut user-docs surface
+
+Remaining roadmap is Epic 6 (ship v1.0): `6-2` docs site infrastructure (mkdocs publishing, hand-written content only — no autodoc), and `6-5` release flow (Trusted Publishers + Sigstore).
+
+## References
+
+- Middleware spec: [`planning/specs/2026-06-05-extension-slot-docs-design.md`](../specs/2026-06-05-extension-slot-docs-design.md)
+- Docs-expansion spec: [`planning/specs/2026-06-05-v0.7-docs-expansion-design.md`](../specs/2026-06-05-v0.7-docs-expansion-design.md)
+- Middleware plan: [`planning/plans/2026-06-05-extension-slot-docs-plan.md`](../plans/2026-06-05-extension-slot-docs-plan.md)
+- Docs-expansion plan: [`planning/plans/2026-06-05-v0.7-docs-expansion-plan.md`](../plans/2026-06-05-v0.7-docs-expansion-plan.md)
+- Roadmap: [`planning/engineering.md`](../engineering.md) §8
diff --git a/planning/specs/2026-06-05-extension-slot-docs-design.md b/planning/specs/2026-06-05-extension-slot-docs-design.md
new file mode 100644
index 0000000..9f607dc
--- /dev/null
+++ b/planning/specs/2026-06-05-extension-slot-docs-design.md
@@ -0,0 +1,138 @@
+# Spec: Extension-slot docs (Epic 3 story 3-6)
+
+**Date:** 2026-06-05
+**Topic slug:** `extension-slot-docs`
+**Status:** drafted, awaiting user review
+**Target release:** 0.7.0 (docs-only minor)
+**Epic 3 stories closed:** 3-6 (the last leftover). Closes Epic 3 entirely.
+
+## Purpose
+
+Document `httpware`'s primary extension point — the **Middleware protocol** — as a user-facing page so library consumers can write their own cross-cutting middleware (request-ID propagation, auth header injection, custom resilience policies, structured tracing, etc.) without reading the source.
+
+This is the deferred-tutorial half of story 3-6. The docs-sync-0.4 pass (PR #25) shipped the freshness fixes and explicitly punted the "_write your own middleware_" walkthrough to a future docs PR. This is that PR.
+
+## Background — how 3-6 got here
+
+- **Original framing (pre-pivot):** "Document the extension slot for custom resilience policies." A tutorial framed around hand-rolling CircuitBreaker / RateLimiter / custom backoff.
+- **docs-sync-0.4 re-scope:** Folded the *freshness* half of 3-6 into a 0.3→0.4 docs catch-up PR; explicitly deferred the tutorial.
+- **This spec:** Closes the tutorial half, scoped to **the Middleware seam only** (Seam A in `engineering.md §3`). ResponseDecoder (Seam B) and the optional-extras pattern (Seam C) stay contributor-facing in `engineering.md` — surfacing them in user docs over-promises an extension surface users shouldn't be touching.
+- **Worked-example flavor:** non-resilience (Request-ID propagation) rather than CircuitBreaker. Demonstrates the protocol applies to anything cross-cutting, pairs naturally with the 0.6.0 observability events (correlate a `httpware.retry` record's `url` attribute with the X-Request-Id the middleware set), and avoids shipping a half-baked CircuitBreaker that would get cargo-culted into production.
+
+## Deliverable
+
+### New page: `docs/middleware.md`
+
+Approximately 150 lines of markdown, structured as:
+
+1. **Intro (~5 lines).** What a middleware is in httpware; cross-cutting concerns it's the right tool for (auth, tracing, logging, custom resilience). Pointer to built-in `Retry`/`Bulkhead` for the common cases.
+
+2. **The Middleware protocol (~25 lines).** The `Middleware` `Protocol` and `Next` type alias, both already exported from `httpware.middleware`:
+
+   ```python
+   from collections.abc import Awaitable, Callable
+   from typing import Protocol, TypeAlias, runtime_checkable
+   import httpx2
+
+   Next: TypeAlias = Callable[[httpx2.Request], Awaitable[httpx2.Response]]
+
+   @runtime_checkable
+   class Middleware(Protocol):
+       async def __call__(self, request: httpx2.Request, next: Next) -> httpx2.Response: ...
+   ```
+
+   Explain: chain composed at `AsyncClient.__init__`, frozen for the client's lifetime. First in the `middleware=[...]` list is outermost (so `[Bulkhead, Retry]` puts Bulkhead outside Retry — one slot covers all attempts). `await next(request)` invokes the next layer; returning without calling it short-circuits the chain (synthesize a `Response` directly).
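+
+   A minimal sketch of the ordering and short-circuit rules above, assuming nothing about httpware's real chain builder. `compose`, `tag`, `short_circuit`, and the `str` stand-ins for `httpx2.Request` / `httpx2.Response` are hypothetical names, used only so the snippet runs without `httpx2` installed:
+
+   ```python
+   import asyncio
+   from collections.abc import Awaitable, Callable
+
+   # Hypothetical stand-ins for httpx2.Request / httpx2.Response.
+   Request = str
+   Response = str
+   Next = Callable[[Request], Awaitable[Response]]
+   Layer = Callable[[Request, Next], Awaitable[Response]]
+
+
+   def _bind(layer: Layer, inner: Next) -> Next:
+       """Close over one layer and the handler it should forward to."""
+       async def call(request: Request) -> Response:
+           return await layer(request, inner)
+       return call
+
+
+   def compose(middleware: list[Layer], terminal: Next) -> Next:
+       """Fold right-to-left so middleware[0] ends up outermost."""
+       handler = terminal
+       for layer in reversed(middleware):
+           handler = _bind(layer, handler)
+       return handler
+
+
+   def tag(name: str, trace: list[str]) -> Layer:
+       """Build a layer that records when it is entered and exited."""
+       async def layer(request: Request, next: Next) -> Response:
+           trace.append(f"enter {name}")
+           response = await next(request)
+           trace.append(f"exit {name}")
+           return response
+       return layer
+
+
+   async def short_circuit(request: Request, next: Next) -> Response:
+       # Returning without awaiting next() skips every inner layer.
+       return "synthesized"
+
+
+   async def terminal(request: Request) -> Response:
+       return f"sent {request}"
+
+
+   trace: list[str] = []
+   chain = compose([tag("bulkhead", trace), tag("retry", trace)], terminal)
+   result = asyncio.run(chain("GET /users/1"))
+   ```
+
+   After the run, `trace` lists `bulkhead` entering first and exiting last, which is the "first in the list is outermost" rule in miniature.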
+
+3. **Phase decorators (~25 lines).** `@before_request`, `@after_response`, `@on_error` from `httpware.middleware` as ergonomic shortcuts for the common cases:
+
+   - **Use these when:** you don't need state-keeping on `self`, and you don't need to wrap the full `await next(...)` call.
+   - **Reach for the raw Protocol when:** you need instance state (e.g., a counter), you need to inspect both the request AND the response (e.g., timing), or you need to interleave behavior around the call (e.g., circuit-breaker state mutation on both success and failure paths).
+
+   Show one minimal pair (a `@before_request` adding a header and an `@on_error` translating an exception type) without dwelling.
+
+4. **Worked example: Request-ID propagation (~50 lines).** Full class-based middleware demonstrating the raw `Middleware` protocol with state-keeping (a configurable header name) plus both phases (set request header before forwarding, log the ID after the response). Uses `logging.getLogger("myapp.request_id")` — explicitly a *consumer* logger, NOT a `httpware.*` logger, to reinforce that the `httpware.*` namespace is reserved for library-emitted events. The example:
+
+   ```python
+   import logging
+   import uuid
+
+   import httpx2
+   from httpware import AsyncClient, Retry
+   from httpware.middleware import Next
+
+   _LOGGER = logging.getLogger("myapp.request_id")
+
+
+   class RequestIdMiddleware:
+       """Propagate a per-call X-Request-Id; log it on response.
+
+       Place OUTSIDE Retry so all attempts of the same call share one ID
+       (correlatable from the consumer's logs to httpware.retry's emitted events
+       via the matching `url` attribute).
+       """
+
+       def __init__(self, *, header: str = "X-Request-Id") -> None:
+           self._header = header
+
+       async def __call__(self, request: httpx2.Request, next: Next) -> httpx2.Response:  # noqa: A002
+           request_id = str(uuid.uuid4())
+           request.headers[self._header] = request_id
+           response = await next(request)
+           _LOGGER.info("request complete", extra={"request_id": request_id, "status": response.status_code})
+           return response
+
+
+   async def main() -> None:
+       async with AsyncClient(
+           base_url="https://api.example.com",
+           middleware=[RequestIdMiddleware(), Retry()],  # ID outside Retry
+       ) as client:
+           await client.get("/users/1")
+   ```
+
+   Brief paragraph after: "Correlate with the 0.6.0 observability events — a `httpware.retry` `retry.giving_up` record carries the same `url` your middleware logged the ID against."
+
+5. **When NOT to write a middleware (~15 lines).** Tight callbacks to existing patterns:
+   - **Redaction:** use a `logging.Filter` on the consumer side (per the 0.6.0 observability spec's no-redaction-in-httpware stance).
+   - **URL / header validation:** `httpx2` owns it; don't reimplement.
+   - **Per-call behavior with no cross-cutting state:** pass it through `request.extensions` or the call-site `extensions=` kwarg instead.
+   - **Span creation for HTTP tracing:** install `opentelemetry-instrumentation-httpx` — don't write an OTel middleware in httpware (see `engineering.md §8` for why `5-4` was retired).
+
+6. **Cross-references (~5 lines).**
+   - `engineering.md §3 Seam A` — the formal protocol contract
+   - `src/httpware/middleware/resilience/` — `Retry`, `Bulkhead`, `RetryBudget` as real-world examples reading the same protocol
+   - `docs/index.md#with-resilience-middleware` — composition with built-ins
+
+### Touchups
+
+- **`mkdocs.yml`:** add `- Middleware: middleware.md` to the nav, between `Quick-Start` and `Development`.
+- **`README.md`:** in the existing "With resilience middleware" subsection, append one sentence: "_Need a custom middleware (auth, tracing, request-ID propagation)? See [`docs/middleware.md`](docs/middleware.md)._"
+- **`docs/index.md`:** in the "Where to go next" section, add one bullet: "**[Middleware guide](middleware.md)** — write your own middleware (Request-ID example included)."
+- **`planning/engineering.md` §8:** replace the existing Epic 3 closing line ("**Remaining:** `3-6` extension-slot docs.") with: "**Epic 3 — Resilience: SHIPPED.** v0.4 shipped `Retry` + `RetryBudget` + `Bulkhead`; v0.7 ships `3-6` extension-slot docs (`docs/middleware.md`)."
+- **`planning/releases/0.7.0.md`:** new file. Short doc-only release notes — calls out the new middleware guide, closes Epic 3, no API changes.
+
+## Non-goals (explicit)
+
+- **No code changes.** This is a docs-only PR. No middleware additions, no protocol extensions, no new public exports.
+- **No CircuitBreaker / RateLimiter / custom-resilience example.** The user explicitly chose a non-resilience example to avoid shipping a half-baked toy that gets cargo-culted.
+- **No ResponseDecoder (Seam B) or optional-extras (Seam C) coverage.** Those stay in `engineering.md` (contributor-facing).
+- **No mkdocs publish / docs-site infra work.** That's Epic 6 story `6-2`; the site_url is still readthedocs.io and we don't try to make it actually publish here.
+- **No version bump in `pyproject.toml`.** Tag-driven release (`uv version $GITHUB_REF_NAME` overwrites at build time).
+- **No `# noqa`s in the example code beyond `# noqa: A002`** (matches the convention already in `src/httpware/middleware/__init__.py` for the `next` parameter name).
+- **No CLAUDE.md changes.**
+
+## Verification gates
+
+- `uv run --with mkdocs --with mkdocs-material mkdocs build --strict 2>&1 | tail -10` → 0 warnings (matches the gate the 0.6.0 work used).
+- All cross-reference links in the new page and the README/docs touchups resolve.
+- The Request-ID example compiles under `ty` if extracted (verified locally during implementation; not committed as a test).
+- Architecture-invariant grep suite still PASSes (no source files modified, but the grep should run anyway for hygiene).
+- Full test suite still passes (no code changes, but `just test` should be a no-op confirmation).
+
+## Release shape
+
+- **Version:** 0.7.0 (semver minor — public docs surface grows but no API).
+- **Branch:** `feat/v0.7-middleware-docs`.
+- **PR:** docs-only, expected ~250 lines markdown net new.
+- **Tag:** `0.7.0` after merge; GitHub Release reads from `planning/releases/0.7.0.md`.
+- **Publish workflow:** unchanged. The tag-driven publish runs even for docs-only releases; the package metadata's classifier set is unchanged.
diff --git a/planning/specs/2026-06-05-v0.7-docs-expansion-design.md b/planning/specs/2026-06-05-v0.7-docs-expansion-design.md
new file mode 100644
index 0000000..e1cef33
--- /dev/null
+++ b/planning/specs/2026-06-05-v0.7-docs-expansion-design.md
@@ -0,0 +1,312 @@
+# Spec: v0.7 docs expansion (Middleware + Resilience + Errors + Testing)
+
+**Date:** 2026-06-05
+**Topic slug:** `v0.7-docs-expansion`
+**Status:** drafted, awaiting user review
+**Target release:** 0.7.0 (bundle onto the open PR #28 / `feat/v0.7-middleware-docs` branch — do NOT branch fresh)
+**Predecessor spec:** [`planning/specs/2026-06-05-extension-slot-docs-design.md`](2026-06-05-extension-slot-docs-design.md) — that one shipped the Middleware guide as the first commit on the same branch; this spec expands the scope of the same release.
+
+## Purpose
+
+Ship the rest of the first-cut user docs surface in the same 0.7 release that already has the Middleware guide queued. After this lands, every shipped feature through 0.6 has a user-facing reference page (`docs/middleware.md`, `docs/resilience.md`, `docs/errors.md`), and the two highest-friction adoption recipes (`docs/testing.md`, OpenTelemetry wire-up appended to `docs/middleware.md`) are documented.
+
+This expansion follows the **user-docs-philosophy**: hand-written reference pages for shipped features + concrete setup-friction recipes. No API autodoc, no benchmarks, no migration guide, no speculative cookbook.
+
+## Background
+
+PR #28 currently scopes "extension-slot docs" — the deferred-tutorial half of story 3-6. During post-spec discussion the user pointed out that several other reference/recipe pages would close real user pain (resilience parameter reference, errors catching strategies, testing recipe, tracing setup) and chose to bundle them into the same 0.7 release rather than queue a v0.8. This spec captures the expanded scope so the implementation plan covers all four additions consistently.
+
+## Deliverable
+
+### New page 1: `docs/resilience.md` (~180 lines)
+
+Deep-dive reference for the three resilience primitives shipped through 0.4/0.6.
+
+1. **Intro (~10 lines)** — what each primitive does in one sentence, when to reach for each, how they compose (`Bulkhead` outside `Retry` so one slot covers all attempts). Pointer to the [Middleware guide](middleware.md) for writing custom resilience.
+
+2. **`Retry` (~60 lines)** — full parameter table:
+
+   | Parameter | Default | Effect |
+   |---|---|---|
+   | `max_attempts` | `3` | Total tries (including the first). `1` disables retries entirely; `<1` raises `ValueError`. |
+   | `base_delay` | `0.1` (s) | Floor for the full-jitter exponential backoff. |
+   | `max_delay` | `5.0` (s) | Ceiling for backoff. |
+   | `attempt_timeout` | `None` | If set, each individual attempt is wrapped in `asyncio.timeout(attempt_timeout)`. |
+   | `retry_status_codes` | `frozenset({408, 429, 502, 503, 504})` | Status codes considered retryable. |
+   | `retry_methods` | `frozenset({"GET", "HEAD", "OPTIONS", "PUT", "DELETE"})` | Idempotent methods only by default. POST excluded; pass an explicit frozenset including `"POST"` to retry it. |
+   | `respect_retry_after` | `True` | When the response carries a `Retry-After` header on a retryable status, sleep for the header value (clamped to `max_delay`) instead of the jittered backoff. |
+   | `budget` | `RetryBudget()` (default-configured) | The token bucket; pass a shared `RetryBudget` instance to apply one budget across multiple clients. |
+
+   Sub-sections:
+   - **Retry-After parsing** — accepts integer seconds OR HTTP-date (RFC 5322 / `email.utils.parsedate_to_datetime` format). Malformed values are ignored; falls back to jittered backoff.
+   - **Streaming-body refusal** — if the request body was an async-iterable, Retry refuses to retry (the iterator is consumed after the first attempt). A PEP 678 note is added to the surfaced exception. The retry behavior at non-idempotent early-exit sites is identical. Cross-reference: [streaming docs](index.md#streaming-responses) and the `httpware.retry` `retry.streaming_refused` observability event.
+   - **Exhaustion behavior** — re-raises the *last* exception (e.g., `ServiceUnavailableError`), preserving exception class so `except ServiceUnavailableError` still catches it. A PEP 678 note is added: `httpware: gave up after N attempts`.
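+
+   The Retry-After rules above could be sketched as follows. `parse_retry_after` is a hypothetical helper, not httpware's actual function, and clamping an already-past date to zero is an assumption of this sketch:
+
+   ```python
+   from datetime import datetime, timezone
+   from email.utils import parsedate_to_datetime
+
+
+   def parse_retry_after(
+       value: str, *, max_delay: float, now: datetime | None = None
+   ) -> float | None:
+       """Return seconds to sleep, or None when the header is malformed."""
+       text = value.strip()
+       if text.isascii() and text.isdigit():
+           # Integer-seconds form, e.g. "Retry-After: 3".
+           seconds = float(text)
+       else:
+           # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
+           try:
+               when = parsedate_to_datetime(text)
+           except (TypeError, ValueError):
+               return None  # caller falls back to jittered backoff
+           if when.tzinfo is None:
+               when = when.replace(tzinfo=timezone.utc)
+           current = now if now is not None else datetime.now(timezone.utc)
+           seconds = (when - current).total_seconds()
+       # Assumption: never sleep a negative time, never exceed max_delay.
+       return min(max(seconds, 0.0), max_delay)
+   ```
+
+   A `None` return is the "malformed, ignore it" signal, so the caller can keep one code path for the jittered fallback.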
+
+3. **`RetryBudget` (~50 lines)** — the Finagle-style token bucket. Parameter table:
+
+   | Parameter | Default | Effect |
+   |---|---|---|
+   | `ttl` | `10.0` (s) | Sliding window over which deposits and withdrawals count. |
+   | `min_retries_per_sec` | `10.0` | Absolute floor — at least this many retries/sec are permitted regardless of deposit rate. |
+   | `percent_can_retry` | `0.2` | Fraction of recent deposits that can convert to retries (above the floor). |
+
+   Sub-sections:
+   - **The token-bucket formula:** `ceiling = int(len(deposits_in_window) * percent_can_retry) + int(min_retries_per_sec * ttl)`. Withdrawals fail when `len(withdrawn_in_window) >= ceiling`.
+   - **Why a floor matters:** if the deposit rate is zero (no traffic yet), the percent term is zero — without the floor, the very first retry would be refused. The floor lets small-traffic clients still retry; high-traffic clients are dominated by the percent term.
+   - **Sharing across clients** — pass the same `RetryBudget` instance to multiple `AsyncClient(middleware=[Retry(budget=shared)])` to enforce one joint budget. Useful when multiple clients hit the same downstream.
+   - **Single-thread assumption** — `RetryBudget` is asyncio-aware (deque mutations between await points are atomic on a single event loop); cross-thread use is out of scope.
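+
+   The formula can be exercised with a small stand-in. `SketchBudget`, its `ceiling()` method, and the injectable `clock` are hypothetical and only mirror the formula above; the real `RetryBudget` may prune and record differently:
+
+   ```python
+   import time
+   from collections import deque
+   from collections.abc import Callable
+
+
+   class SketchBudget:
+       """Sliding-window token bucket following the spec's ceiling formula."""
+
+       def __init__(
+           self,
+           *,
+           ttl: float = 10.0,
+           min_retries_per_sec: float = 10.0,
+           percent_can_retry: float = 0.2,
+           clock: Callable[[], float] = time.monotonic,
+       ) -> None:
+           self._ttl = ttl
+           self._floor = int(min_retries_per_sec * ttl)
+           self._percent = percent_can_retry
+           self._clock = clock
+           self._deposits: deque[float] = deque()
+           self._withdrawn: deque[float] = deque()
+
+       def _prune(self, now: float) -> None:
+           # Drop timestamps that have slid out of the ttl window.
+           cutoff = now - self._ttl
+           for window in (self._deposits, self._withdrawn):
+               while window and window[0] <= cutoff:
+                   window.popleft()
+
+       def deposit(self) -> None:
+           now = self._clock()
+           self._prune(now)
+           self._deposits.append(now)
+
+       def ceiling(self) -> int:
+           self._prune(self._clock())
+           return int(len(self._deposits) * self._percent) + self._floor
+
+       def try_withdraw(self) -> bool:
+           if len(self._withdrawn) >= self.ceiling():
+               return False  # budget refuses this retry
+           self._withdrawn.append(self._clock())
+           return True
+   ```
+
+   With the documented defaults and no traffic, the percent term is zero and the floor alone gives `int(10.0 * 10.0)` permitted retries per window.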
+
+4. **`Bulkhead` (~30 lines)** — concurrency limiter. Parameter table:
+
+   | Parameter | Default | Effect |
+   |---|---|---|
+   | `max_concurrent` | **REQUIRED** | Maximum in-flight requests. `<1` raises `ValueError`. No default — the right cap depends on downstream capacity. |
+   | `acquire_timeout` | `1.0` (s) | How long to wait for a slot before raising `BulkheadFullError`. `None` waits forever; `0` fails fast. `<0` raises `ValueError`. |
+
+   Sub-sections:
+   - **Slot release** — releases on success, exception, AND cancellation (a `try/finally` around `await next(request)`). The slot can't leak.
+   - **Sharing across clients** — same pattern as `RetryBudget`. One instance, many clients.
+   - **Composition with `Retry`** — `middleware=[Bulkhead(...), Retry()]`. One slot covers all retry attempts of a single call.
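+
+   The slot-release guarantee can be illustrated with a sketch built on `asyncio.Semaphore`. `SketchBulkhead` and `SketchBulkheadFull` are hypothetical names, not the library's classes, and the sketch only covers a positive `acquire_timeout` or `None`:
+
+   ```python
+   import asyncio
+   from collections.abc import Awaitable, Callable
+
+
+   class SketchBulkheadFull(Exception):
+       """Raised when no slot opens before acquire_timeout elapses."""
+
+
+   class SketchBulkhead:
+       def __init__(self, *, max_concurrent: int, acquire_timeout: float | None = 1.0) -> None:
+           if max_concurrent < 1:
+               raise ValueError("max_concurrent must be >= 1")
+           self._slots = asyncio.Semaphore(max_concurrent)
+           self._acquire_timeout = acquire_timeout
+
+       async def __call__(
+           self, request: str, next: Callable[[str], Awaitable[str]]
+       ) -> str:
+           try:
+               # None waits forever; a positive value bounds the wait.
+               await asyncio.wait_for(self._slots.acquire(), self._acquire_timeout)
+           except asyncio.TimeoutError:
+               raise SketchBulkheadFull("no slot available") from None
+           try:
+               return await next(request)
+           finally:
+               # Runs on success, exception, and cancellation alike.
+               self._slots.release()
+   ```
+
+   Because the release sits in `finally`, a downstream exception or a cancelled call still hands the slot back to the next waiter.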
+
+5. **Composition guidance (~20 lines)** — the canonical ordering, why it matters, what happens if you flip it (each attempt grabs a fresh slot — defeats the bulkhead). Brief note that custom middleware that emit cross-cutting state (e.g., the Request-ID middleware from the [Middleware guide](middleware.md)) should sit outside `Retry` for the same reason.
+
+6. **Cross-references** — `planning/engineering.md §3` (Seam A), the resilience module path, and the [Observability](index.md#observability) section for the events these middleware emit.
+
+### New page 2: `docs/errors.md` (~120 lines)
+
+The full exception tree and how to catch what.
+
+1. **Intro (~10 lines)** — `httpware` raises typed exceptions automatically; the tree is rooted at `ClientError`; HTTP-status responses raise status-keyed `StatusError` subclasses. Pointer to the [resilience reference](resilience.md) for the resilience-specific errors.
+
+2. **The tree (~30 lines)** — ASCII tree diagram showing the full hierarchy:
+
+   ```
+   ClientError                          (catch-all for anything httpware raises)
+   ├── TransportError                   (connection/network/protocol failure pre-response)
+   │   └── NetworkError                 (transient — safe to retry; covered by Retry's defaults)
+   ├── TimeoutError                     (also inherits builtins.TimeoutError — except OSError catches it)
+   ├── StatusError                      (got a response but its status was 4xx/5xx)
+   │   ├── ClientStatusError            (any 4xx — fallback for unknown 4xx codes)
+   │   │   ├── BadRequestError          (400)
+   │   │   ├── UnauthorizedError        (401)
+   │   │   ├── ForbiddenError           (403)
+   │   │   ├── NotFoundError            (404)
+   │   │   ├── ConflictError            (409)
+   │   │   ├── UnprocessableEntityError (422)
+   │   │   └── RateLimitedError         (429)
+   │   └── ServerStatusError            (any 5xx — fallback for unknown 5xx codes)
+   │       ├── InternalServerError      (500)
+   │       └── ServiceUnavailableError  (503)
+   ├── RetryBudgetExhaustedError        (a retry was needed but the budget refused)
+   └── BulkheadFullError                (acquire_timeout elapsed before a slot opened)
+   ```
+
+3. **Status-code mapping table (~15 lines)** — the `STATUS_TO_EXCEPTION` table verbatim, plus the fallback rule (`400 ≤ status < 500` → `ClientStatusError`; `500 ≤ status < 600` → `ServerStatusError`).
+
+4. **Catching strategies (~25 lines)** — practical patterns:
+
+   ```python
+   import logging
+
+   from httpware import (
+       AsyncClient, ClientError, StatusError, NetworkError, TimeoutError,
+       NotFoundError, RetryBudgetExhaustedError, BulkheadFullError,
+   )
+
+   _LOGGER = logging.getLogger(__name__)
+
+
+   # `User` is the response model from the Quick-Start example.
+   async def fetch_user(client: AsyncClient) -> "User | None":
+       try:
+           return await client.get("/users/1", response_model=User)
+       except NotFoundError:
+           # Specific status — most precise
+           return None
+       except StatusError as exc:
+           # Got a response, but it's an error status. exc.response.* available.
+           _LOGGER.warning("got %s for %s", exc.response.status_code, exc.response.request.url)
+           raise
+       except NetworkError:
+           # Transient transport failure. Already retried by default Retry middleware
+           # if installed and the method was idempotent. If you see this, retries
+           # exhausted or method was non-idempotent.
+           raise
+       except TimeoutError:
+           # httpware's TimeoutError (the import above shadows the builtin).
+           raise
+       except (RetryBudgetExhaustedError, BulkheadFullError) as exc:
+           # Resilience refusal. Backpressure signal — back off the caller.
+           _LOGGER.error("resilience refused: %s", exc)
+           raise
+       except ClientError:
+           # Catch-all for anything else httpware raised.
+           raise
+   ```
+
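+   Plus a brief paragraph: `TimeoutError` inherits from both `ClientError` and `builtins.TimeoutError`, so `except builtins.TimeoutError` and `except OSError` both catch it. On Python 3.11+ this is also the class `asyncio.wait_for` raises, because `asyncio.TimeoutError` became an alias of the builtin in 3.11.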
+
+5. **`exc.response.*` access pattern (~20 lines)** — the response object on `StatusError` subclasses is an `httpx2.Response`. Examples:
+
+   ```python
+   exc.response.status_code   # 404
+   exc.response.headers       # httpx2.Headers — case-insensitive
+   exc.response.content       # raw bytes
+   exc.response.text          # decoded body
+   exc.response.json()        # parsed JSON
+   exc.response.request       # the failing httpx2.Request
+   exc.response.request.url   # the failing URL
+   ```
+
+   Note: `__repr__` and the exception summary strip `user:pass@` userinfo from the URL to avoid credential leaks in tracebacks. Query-string secrets are NOT stripped — keep secrets out of query strings.
+
+6. **`RetryBudgetExhaustedError` / `BulkheadFullError` payloads (~15 lines)** — what's on each:
+   - `RetryBudgetExhaustedError`: `last_response`, `last_exception`, `attempts`
+   - `BulkheadFullError`: `max_concurrent`, `acquire_timeout`
+
+7. **Cross-references** — [resilience reference](resilience.md), [middleware guide](middleware.md) for the on_error decorator that can translate exceptions.
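+
+The fallback rule in item 3 could be sketched roughly as below. This is a minimal illustration, not `httpware`'s implementation: the exception classes are local stand-ins that mirror a few names from the tree, the `STATUS_TO_EXCEPTION` dict is abbreviated, and `resolve_status_error` is a hypothetical helper name.
+
+```python
+from __future__ import annotations
+
+
+# Local stand-ins mirroring part of the tree above (not imported from httpware).
+class ClientError(Exception): ...
+class StatusError(ClientError): ...
+class ClientStatusError(StatusError): ...
+class ServerStatusError(StatusError): ...
+class NotFoundError(ClientStatusError): ...
+class RateLimitedError(ClientStatusError): ...
+class ServiceUnavailableError(ServerStatusError): ...
+
+
+# Abbreviated stand-in for the STATUS_TO_EXCEPTION table.
+STATUS_TO_EXCEPTION: dict[int, type[StatusError]] = {
+    404: NotFoundError,
+    429: RateLimitedError,
+    503: ServiceUnavailableError,
+}
+
+
+def resolve_status_error(status: int) -> type[StatusError] | None:
+    """Hypothetical helper: exact match first, then the range fallback."""
+    exact = STATUS_TO_EXCEPTION.get(status)
+    if exact is not None:
+        return exact
+    if 400 <= status < 500:
+        return ClientStatusError  # unknown 4xx
+    if 500 <= status < 600:
+        return ServerStatusError  # unknown 5xx
+    return None  # not an error status
+```
+
+Under these assumptions an unmapped code such as 418 resolves to `ClientStatusError`, so `except ClientStatusError` still catches it even though no dedicated subclass exists.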
+
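+The double inheritance described for `TimeoutError` in item 4 can be illustrated with stand-in classes. This is only a sketch: `HttpwareTimeoutError` is a hypothetical name used here to avoid shadowing the builtin, and `caught_by` is a throwaway helper, neither is part of `httpware`.
+
+```python
+import builtins
+
+
+class ClientError(Exception):
+    """Stand-in for httpware.ClientError."""
+
+
+class HttpwareTimeoutError(ClientError, builtins.TimeoutError):
+    """Stand-in for httpware.TimeoutError: one class, two ancestries."""
+
+
+def caught_by(handler_type: type[BaseException]) -> bool:
+    """Report whether an `except handler_type` clause catches the stand-in."""
+    try:
+        raise HttpwareTimeoutError("request timed out")
+    except handler_type:
+        return True
+    except Exception:
+        return False
+```
+
+With these stand-ins, `caught_by(ClientError)`, `caught_by(builtins.TimeoutError)`, and `caught_by(OSError)` all return `True`, which is why handler order matters when a broad `except OSError` sits above a more specific clause.
+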
+### New page 3: `docs/testing.md` (~80 lines)
+
+The mock-transport injection pattern.
+
+1. **Intro (~10 lines)** — `httpware`'s test seam is `httpx2`. Pass any `httpx2.AsyncClient` (including one built on `httpx2.MockTransport`) to `AsyncClient(httpx2_client=...)`. The middleware chain still runs; only the wire is mocked.
+
+2. **The basic pattern (~20 lines)** — the canonical example, mirroring how the project's own test files do it:
+
+   ```python
+   from http import HTTPStatus
+
+   import httpx2
+   import pytest
+
+   from httpware import AsyncClient
+
+
+   def handler(request: httpx2.Request) -> httpx2.Response:
+       return httpx2.Response(HTTPStatus.OK, json={"id": 1, "name": "Alice"})
+
+
+   async def test_get_user() -> None:
+       transport = httpx2.MockTransport(handler)
+       async with AsyncClient(httpx2_client=httpx2.AsyncClient(transport=transport)) as client:
+           response = await client.get("https://api.example.test/users/1")
+       assert response.status_code == HTTPStatus.OK
+   ```
+
+   Note: with `pytest-asyncio` in auto mode (`asyncio_mode = "auto"` under `[tool.pytest.ini_options]` in `pyproject.toml`), async tests don't need `@pytest.mark.asyncio`.
+
+3. **Recording / stateful handlers (~20 lines)** — pattern for handlers that vary by call count or record requests:
+
+   ```python
+   class _ResponseSequence:
+       def __init__(self, statuses: list[int]) -> None:
+           self._statuses = list(statuses)
+           self.calls: list[httpx2.Request] = []
+
+       def __call__(self, request: httpx2.Request) -> httpx2.Response:
+           self.calls.append(request)
+           status = self._statuses.pop(0) if self._statuses else 200
+           return httpx2.Response(status, request=request)
+   ```
+
+4. **Testing your custom middleware (~15 lines)** — compose your middleware with the mock transport to exercise the chain end-to-end:
+
+   ```python
+   async def test_my_middleware_adds_header() -> None:
+       handler = _ResponseSequence([HTTPStatus.OK])
+       async with AsyncClient(
+           httpx2_client=httpx2.AsyncClient(transport=httpx2.MockTransport(handler)),
+           middleware=[MyHeaderMiddleware()],
+       ) as client:
+           await client.get("https://example.test/x")
+       assert handler.calls[0].headers["X-My-Header"] == "expected-value"
+   ```
+
+5. **What about `respx`? (~5 lines)** — `httpware` deliberately uses `httpx2.MockTransport` instead. The mock transport is the public test seam in httpx; `respx` patches private internals and breaks across httpx versions. Stick with `MockTransport`.
+
+6. **Cross-references** — [Middleware guide](middleware.md) for writing the middleware you're testing; engineering.md `§6` for the project's own testing patterns.
+
+### Edit: append "Wiring OpenTelemetry" section to `docs/middleware.md`
+
+After the existing "When NOT to write a middleware" section and before "See also", insert ~30 lines:
+
+````markdown
+## Wiring OpenTelemetry
+
+`httpware[otel]` only ships `opentelemetry-api`. To make the observability events emitted by `Retry` and `Bulkhead` visible, you also need:
+
+- An **SDK** (`opentelemetry-sdk`) to actually collect spans
+- An **HTTP instrumentor** (`opentelemetry-instrumentation-httpx`) so each HTTP call creates a span — `httpware`'s events attach to that span via `trace.get_current_span().add_event(...)`
+
+Minimal setup (console exporter for development):
+
+```python
+from opentelemetry import trace
+from opentelemetry.sdk.trace import TracerProvider
+from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter
+from opentelemetry.instrumentation.httpx import HTTPXClientInstrumentor
+
+trace.set_tracer_provider(TracerProvider())
+trace.get_tracer_provider().add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
+HTTPXClientInstrumentor().instrument()
+```
+
+After this runs, every `httpware` HTTP call gets an `HTTP <method>` span from the instrumentor, and Retry/Bulkhead observability events appear as span events on it (no extra configuration needed in `httpware` itself — the events fire whenever an active span is present).
+
+For production, swap `ConsoleSpanExporter` for your OTLP/Jaeger/Zipkin exporter. See the [OpenTelemetry Python docs](https://opentelemetry.io/docs/languages/python/) for the full SDK setup.
+````
+
+### Touchups
+
+- **`mkdocs.yml`** — add three nav entries (Resilience, Errors, Testing) so the final nav reads:
+  ```yaml
+  nav:
+    - Quick-Start: index.md
+    - Resilience: resilience.md
+    - Middleware: middleware.md
+    - Errors: errors.md
+    - Testing: testing.md
+    - Development:
+        - Contributing: dev/contributing.md
+  ```
+  (Resilience precedes Middleware because most users *use* the built-ins before they *write* their own.)
+
+- **`docs/index.md`** — add three bullets to the "Where to go next" section around the existing "Middleware guide" bullet that the prior commit on this branch added: Resilience above it, Errors and Testing below it. After the edit the section reads:
+  ```markdown
+  ## Where to go next
+
+  - **[Resilience reference](resilience.md)** — every parameter on `Retry`, `RetryBudget`, and `Bulkhead`; the retry-rule matrix; Retry-After parsing; budget sharing.
+  - **[Middleware guide](middleware.md)** — write your own middleware. Covers the Middleware Protocol, the phase decorators, a worked Request-ID propagation example, and OpenTelemetry wiring.
+  - **[Errors reference](errors.md)** — the full exception tree, catching strategies, `exc.response.*` access pattern.
+  - **[Testing guide](testing.md)** — mock-transport injection pattern for testing code that uses `httpware`.
+  - **[Engineering Notes](https://github.com/modern-python/httpware/blob/main/planning/engineering.md)** — design invariants, the three protocol seams, exception contract, module layout, testing patterns, optional-extras pattern. Lives in the repo at `planning/engineering.md`.
+  - **[Contributing](dev/contributing.md)** — setup, conventions, workflow.
+  - **[Release notes](https://github.com/modern-python/httpware/releases)** — per-version changelogs.
+  ```
+
+- **`planning/engineering.md` §8** — the existing "Shipped in v0.7" line for Epic 3 gets enriched (just append a sentence): "_v0.7 also bundles the rest of the first-cut user docs surface — `docs/resilience.md`, `docs/errors.md`, `docs/testing.md` — and an OpenTelemetry wire-up section in `docs/middleware.md`._"
+
+- **`planning/releases/0.7.0.md`** — REWRITE. New title: "First-cut user docs (docs-only)". Body covers all four additions, names the four new pages, lists what's NOT in this release (still: no source changes, no API changes, no autodoc/benchmarks/cookbook), closes Epic 3.
+
+- **PR #28 title + body** — update via `gh pr edit` after the new commits push. New title: `feat(v0.7): first-cut user docs — Middleware + Resilience + Errors + Testing (closes Epic 3)`. Body reflects the expanded scope.
+
+## Non-goals (explicit)
+
+- **No source code changes.** Still docs-only across the whole 0.7 release.
+- **No new built-in middleware or features.** Documents what already shipped.
+- **No API autodoc / mkdocstrings layer** — per the user-docs-philosophy memory.
+- **No benchmarks** — per the same memory.
+- **No migration guide** — no one's on 0.1.
+- **No speculative cookbook recipes** — only the testing + tracing recipes, which are concrete adoption-friction fixes for already-shipping extras / patterns.
+- **No mkdocs publish workflow / docs-site infra.** Epic 6 story `6-2`; we keep `mkdocs build --strict` green and that's it.
+- **No new dedicated `docs/tracing.md` page** — the OTel wire-up rides as a section of `docs/middleware.md` (closer to the observability events it cross-references).
+- **No version bump in `pyproject.toml`** — tag-driven.
+- **No CLAUDE.md changes.**
+- **No new git branch.** Continue on `feat/v0.7-middleware-docs`. The new commits stack on top of the existing 6.
+
+## Verification gates
+
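+- `uv run --with mkdocs --with mkdocs-material mkdocs build --strict 2>&1 | tail -10` → 0 warnings. The pipeline reports `tail`'s exit status unless `pipefail` is set, so judge success from the output rather than the exit code. Every internal anchor used in the new pages must resolve (cross-references between resilience.md, middleware.md, errors.md, testing.md, and index.md).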
+- `just lint-ci` clean (no source files touched).
+- `just test` 251 passing, 100% coverage (no source files touched).
+- All architecture-invariant greps still PASS.
+- Manual link scan across the three new pages and the edited `docs/middleware.md` — `grep -nE '\]\(' docs/resilience.md docs/errors.md docs/testing.md docs/middleware.md` — every link target either resolves as a docs-internal anchor or is a clearly-external URL.
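+
+The file-existence half of the manual link scan could be partly automated. The sketch below is an assumption-laden helper, not an existing project script: `find_broken_links` is a hypothetical name, it only checks that the file part of a relative link exists next to the page, and it does not validate `#anchor` fragments.
+
+```python
+import re
+from pathlib import Path
+
+# Matches the target of an inline Markdown link, with an optional "title".
+LINK_RE = re.compile(r'\]\(([^)\s]+)(?:\s+"[^"]*")?\)')
+EXTERNAL_PREFIXES = ("http://", "https://", "mailto:")
+
+
+def find_broken_links(pages: list[Path]) -> list[tuple[Path, int, str]]:
+    """Return (page, line number, target) for relative links whose file is missing."""
+    broken = []
+    for page in pages:
+        lines = page.read_text(encoding="utf-8").splitlines()
+        for lineno, line in enumerate(lines, start=1):
+            for target in LINK_RE.findall(line):
+                if target.startswith(EXTERNAL_PREFIXES):
+                    continue  # clearly-external URL
+                path_part = target.split("#", 1)[0]
+                if not path_part:
+                    continue  # same-page anchor; fragments are not checked here
+                if not (page.parent / path_part).exists():
+                    broken.append((page, lineno, target))
+    return broken
+
+
+if __name__ == "__main__":
+    names = ("resilience.md", "errors.md", "testing.md", "middleware.md")
+    pages = [Path("docs") / name for name in names]
+    for page, lineno, target in find_broken_links([p for p in pages if p.exists()]):
+        print(f"{page}:{lineno}: unresolved link target {target}")
+```
+
+An empty result only shows that every relative link points at an existing file; anchor fragments still need the strict build or a manual check.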
+
+## Release shape
+
+- **Version:** stays `0.7.0`.
+- **Branch:** stays `feat/v0.7-middleware-docs` (existing 6 commits + new commits stack on top).
+- **PR:** stays #28. Title + body updated after push.
+- **Tag/Release:** `0.7.0` after merge; GitHub Release reads from the rewritten `planning/releases/0.7.0.md`.
+- **Per-page commit cadence:** matches the prior 0.7 commits — one commit per page / per touchup.