diff --git a/planning/plans/2026-06-08-otel-partial-install-plan.md b/planning/plans/2026-06-08-otel-partial-install-plan.md new file mode 100644 index 0000000..f2a3151 --- /dev/null +++ b/planning/plans/2026-06-08-otel-partial-install-plan.md @@ -0,0 +1,500 @@ +# OTel Partial-Install Hardening Implementation Plan (0.8.4) + +> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Land 3 commits on branch `fix/otel-partial-install` that close the 2 OTel partial-install audit findings, draft release notes for 0.8.4, and open a PR. + +**Architecture:** Two surgical defensive fixes: (1) `import_checker.is_otel_installed` probes `opentelemetry.trace` (a sub-module that ships with `opentelemetry-api`) instead of the bare `opentelemetry` namespace; (2) `_emit_event` wraps the lazy `from opentelemetry import trace` in `try/except ImportError` so a partial install degrades to log-only emission instead of crashing a live request. Tests exercise the broken-install crash path via `sys.modules` monkey-patching. + +**Tech Stack:** Python 3.11+, `importlib.util.find_spec`, `pytest`. No new dependencies. No public API change. + +--- + +## Spec reference + +`planning/specs/2026-06-08-otel-partial-install-design.md`. Decisions locked there (not re-debated): probe target is `opentelemetry.trace`; the try/except catches `ImportError` only (not bare `Exception`); the `add_event` call stays outside the try block; module docstring grows one sentence about the soft-fallback behavior. + +## File structure + +``` +src/httpware/_internal/import_checker.py # Task 1 — find_spec target change + comment +src/httpware/_internal/observability.py # Task 2 — try/except wrap + docstring sentence +tests/test_optional_extras_otel_missing.py # Tasks 1, 2 — new tests +planning/releases/0.8.4.md # Task 3 — new file +``` + +No new source files. No file deletions. Branch is already created (`fix/otel-partial-install`); spec already committed at `a9858ea`. + +## A note on testability + +The `is_otel_installed` flag is computed at module load time. A unit test can confirm the LIVE probe matches expectations in the CI environment (where `opentelemetry-api` IS installed via `--all-extras`), but cannot directly exercise the "namespace-package false positive" without reinstalling packages. We do the next best thing: a test that documents the probe target, plus a runtime test that simulates `from opentelemetry import trace` failing via a broken `sys.modules['opentelemetry']` stand-in — this exercises the try/except path end-to-end without touching the install. + +--- + +## Task 1: `find_spec("opentelemetry.trace")` probe + verification test + +**Files:** +- Modify: `src/httpware/_internal/import_checker.py` +- Modify: `tests/test_optional_extras_otel_missing.py` + +Closes audit Low finding (`import_checker.py:8`). + +- [ ] **Step 1: Read current state** + +```bash +cat src/httpware/_internal/import_checker.py +``` + +Confirm: the file has exactly 3 `is_*_installed` constants at module top; the `is_otel_installed` line currently reads `is_otel_installed = find_spec("opentelemetry") is not None`. + +- [ ] **Step 2: Write the failing/regression-guard test FIRST** + +Append to `tests/test_optional_extras_otel_missing.py`: + +```python +def test_is_otel_installed_uses_opentelemetry_trace_probe() -> None: + """The install probe must require opentelemetry-api, not just the namespace package. + + `opentelemetry` is a PEP 420 namespace — instrumentation packages create the + directory even when `opentelemetry-api` is absent. Probing the bare namespace + would return a non-None spec and `is_otel_installed` would become True, then + the lazy import in `_emit_event` would raise ImportError at runtime. + + Probing `opentelemetry.trace` (which ships with `opentelemetry-api`) closes + the gap. This test pins that contract: production must probe the trace + sub-module, not the bare namespace. + """ + from importlib.util import find_spec + + # In CI (opentelemetry-api IS installed), both probes return non-None. + # The asserts below confirm the live state and document the chosen probe. + assert find_spec("opentelemetry.trace") is not None + from httpware._internal import import_checker + assert import_checker.is_otel_installed is True + + # The structural assertion: the module-load-time constant must be derived + # from the trace-sub-module probe. Read the source to enforce this. + source = ( + __import__("inspect") + .getsource(import_checker) + ) + assert "find_spec(\"opentelemetry.trace\")" in source, ( + "import_checker must probe opentelemetry.trace (PEP 420 namespace hazard); " + "see planning/audit/2026-06-07-deep-audit.md (Low finding on import_checker.py:8)." + ) +``` + +This test relies on the source check to guard against a future revert to `find_spec("opentelemetry")` — the live `find_spec` calls return non-None either way in CI, so source-level pinning is the regression guard. + +- [ ] **Step 3: Run the test — confirm it FAILS against current production** + +```bash +uv run pytest tests/test_optional_extras_otel_missing.py::test_is_otel_installed_uses_opentelemetry_trace_probe -x --no-cov -q +``` + +Expected: FAIL on the source-check assertion (production still has `find_spec("opentelemetry")`, not `find_spec("opentelemetry.trace")`). + +- [ ] **Step 4: Fix production — change the find_spec target** + +Edit `src/httpware/_internal/import_checker.py`. Use Edit tool with these exact strings: + +old_string: +```python +is_otel_installed = find_spec("opentelemetry") is not None +``` + +new_string: +```python +# opentelemetry/ is a PEP 420 namespace package — instrumentation packages create it +# even without opentelemetry-api. Probe opentelemetry.trace (ships with api) instead. +is_otel_installed = find_spec("opentelemetry.trace") is not None +``` + +- [ ] **Step 5: Run test — confirm it PASSES** + +```bash +uv run pytest tests/test_optional_extras_otel_missing.py::test_is_otel_installed_uses_opentelemetry_trace_probe -x --no-cov -q +``` + +Expected: PASS. + +- [ ] **Step 6: Run the full file** + +```bash +uv run pytest tests/test_optional_extras_otel_missing.py tests/test_observability.py tests/test_optional_extras_isolation.py tests/test_optional_extras_pydantic_missing.py -x --no-cov -q +``` + +Expected: all pass. (Other extras tests use the same module-load-time `is_*_installed` flags; the change is isolated to OTel detection.) + +- [ ] **Step 7: Lint** + +```bash +just lint-ci +``` + +Expected: green. + +- [ ] **Step 8: Commit** + +```bash +git add src/httpware/_internal/import_checker.py tests/test_optional_extras_otel_missing.py +git commit -m "$(cat <<'EOF' +fix(import-checker): probe opentelemetry.trace, not the bare namespace + +`opentelemetry` is a PEP 420 native namespace package — any +`opentelemetry-instrumentation-*` package creates the directory even when +`opentelemetry-api` is absent. `find_spec("opentelemetry")` then returns +non-None and `is_otel_installed` becomes True; the lazy +`from opentelemetry import trace` in `_emit_event` subsequently raises +ImportError mid-request. + +Probe `opentelemetry.trace` instead — it ships with `opentelemetry-api` +and is absent when the api package is. The check is now cheap and +correct under every install permutation we can construct. + +The new test source-checks the probe target so a future revert to the +bare-namespace probe trips the regression guard, even though the live +`find_spec` calls return non-None either way under `--all-extras`. + +Closes audit Low finding (import_checker.py:8) from +planning/audit/2026-06-07-deep-audit.md. + +Co-Authored-By: Claude Opus 4.7 (1M context) +EOF +)" +``` + +--- + +## Task 2: Wrap the lazy OTel import in `try/except ImportError` + +**Files:** +- Modify: `src/httpware/_internal/observability.py` +- Modify: `tests/test_optional_extras_otel_missing.py` + +Closes audit Low finding (`observability.py:40`). + +- [ ] **Step 1: Read current state** + +```bash +cat src/httpware/_internal/observability.py +``` + +Confirm: `_emit_event` body (lines 38-42) is: + +```python + logger.log(level, message, extra=attributes) + if import_checker.is_otel_installed: + from opentelemetry import trace # noqa: PLC0415 — lazy by design (optional-extras isolation) + + trace.get_current_span().add_event(event_name, attributes=attributes) +``` + +- [ ] **Step 2: Write the failing test FIRST** + +Append to `tests/test_optional_extras_otel_missing.py`: + +```python +def test_emit_event_survives_lazy_import_failure(caplog: pytest.LogCaptureFixture) -> None: + """When is_otel_installed=True but `from opentelemetry import trace` raises ImportError, + _emit_event must degrade to log-only emission rather than crash. + + Simulates the partial-install case: opentelemetry/ namespace directory exists (created by + some instrumentation package) but opentelemetry-api is missing or broken, so importing + `trace` from it fails. + """ + import sys + + class _BrokenOpenTelemetry: + """Stand-in for opentelemetry/ namespace directory without working api.""" + + def __getattr__(self, name: str) -> object: + msg = f"cannot import name {name!r} from 'opentelemetry'" + raise ImportError(msg) + + # Save and replace the real opentelemetry module for the duration of the test. + saved = sys.modules.pop("opentelemetry", None) + sys.modules["opentelemetry"] = _BrokenOpenTelemetry() # type: ignore[assignment] + try: + with ( + patch("httpware._internal.import_checker.is_otel_installed", True), + caplog.at_level(logging.WARNING, logger="httpware.test.otel_missing"), + ): + _emit_event( + _TEST_LOGGER, + "test.event", + level=logging.WARNING, + message="survives broken otel", + attributes={"k": "v"}, + ) + finally: + if saved is not None: + sys.modules["opentelemetry"] = saved + else: + sys.modules.pop("opentelemetry", None) + + # The structured log record still fired despite the OTel branch failing. + assert any(r.message == "survives broken otel" for r in caplog.records) +``` + +The `_TEST_LOGGER` constant already exists in this file from Step 1's Task. `patch` and `pytest` are already imported. + +- [ ] **Step 3: Run the test — confirm it FAILS** + +```bash +uv run pytest tests/test_optional_extras_otel_missing.py::test_emit_event_survives_lazy_import_failure -x --no-cov -q +``` + +Expected: FAIL with `ImportError: cannot import name 'trace' from 'opentelemetry'` propagating out of `_emit_event`. + +- [ ] **Step 4: Fix production — wrap the lazy import** + +Edit `src/httpware/_internal/observability.py`. Use Edit tool: + +old_string: +```python + logger.log(level, message, extra=attributes) + if import_checker.is_otel_installed: + from opentelemetry import trace # noqa: PLC0415 — lazy by design (optional-extras isolation) + + trace.get_current_span().add_event(event_name, attributes=attributes) +``` + +new_string: +```python + logger.log(level, message, extra=attributes) + if import_checker.is_otel_installed: + try: + from opentelemetry import trace # noqa: PLC0415 — lazy by design (optional-extras isolation) + except ImportError: + # opentelemetry namespace exists but the api package is broken or missing — + # degrade to log-only emission. The structured log record above has already fired. + return + trace.get_current_span().add_event(event_name, attributes=attributes) +``` + +- [ ] **Step 5: Update the `_emit_event` docstring** + +In the same file (`src/httpware/_internal/observability.py`), update the second numbered point of the docstring (around lines 28-32). Use Edit tool: + +old_string: +```python + 2. If ``opentelemetry-api`` is installed, calls + ``trace.get_current_span().add_event(event_name, attributes=attributes)``. + When no tracer is active, ``get_current_span()`` returns a ``NonRecordingSpan`` + whose ``add_event`` is a documented no-op — so the call is unconditional + behind the install gate. +``` + +new_string: +```python + 2. If ``opentelemetry-api`` is installed, calls + ``trace.get_current_span().add_event(event_name, attributes=attributes)``. + When no tracer is active, ``get_current_span()`` returns a ``NonRecordingSpan`` + whose ``add_event`` is a documented no-op — so the call is unconditional + behind the install gate. If the install gate is wrong (the namespace exists + but the api package is missing or broken), the lazy import raises + ``ImportError``; we degrade silently to log-only emission. +``` + +- [ ] **Step 6: Run the test — confirm it PASSES** + +```bash +uv run pytest tests/test_optional_extras_otel_missing.py::test_emit_event_survives_lazy_import_failure -x --no-cov -q +``` + +Expected: PASS. + +- [ ] **Step 7: Run the full OTel + observability test suite** + +```bash +uv run pytest tests/test_optional_extras_otel_missing.py tests/test_observability.py tests/test_retry.py tests/test_retry_sync.py tests/test_bulkhead.py tests/test_bulkhead_sync.py -x --no-cov -q +``` + +Expected: all pass. (The retry + bulkhead tests exercise `_emit_event` indirectly through middleware; a regression in the try/except would surface here.) + +- [ ] **Step 8: Lint** + +```bash +just lint-ci +``` + +Expected: green. + +- [ ] **Step 9: Commit** + +```bash +git add src/httpware/_internal/observability.py tests/test_optional_extras_otel_missing.py +git commit -m "$(cat <<'EOF' +fix(observability): wrap lazy OTel import in try/except ImportError + +_emit_event gated the OTel path on `if import_checker.is_otel_installed` but +ran `from opentelemetry import trace` unguarded. If is_otel_installed was +True yet the import failed (PEP 420 namespace false-positive in 0.8.3 and +earlier; or any future partial install / broken api package), the +ImportError escaped _emit_event and crashed the middleware calling it +(AsyncRetry, Retry, AsyncBulkhead, Bulkhead) mid-request. + +Wrap the lazy import in `try/except ImportError`. On failure, return — the +structured log record on the line above has already fired, so emission +degrades to log-only. + +Catch ImportError specifically, not bare Exception: misconfigured-tracer +crashes (RuntimeError, AttributeError) should still surface; only the +install-gate-is-wrong case is in scope. + +_emit_event's docstring grows one sentence describing the soft-fallback. + +Closes audit Low finding (observability.py:40) from +planning/audit/2026-06-07-deep-audit.md. + +Co-Authored-By: Claude Opus 4.7 (1M context) +EOF +)" +``` + +--- + +## Task 3: Draft 0.8.4 release notes + open PR + +**Files:** +- Create: `planning/releases/0.8.4.md` + +- [ ] **Step 1: Read the 0.8.1 release-notes file to mirror its shape** + +```bash +cat planning/releases/0.8.1.md +``` + +(0.8.3 had three behavioral changes so its release notes are heavier — 0.8.4 is a single defensive fix and should match the 0.8.1 shape more closely: TL;DR + "The gap" + "The fix" + Upgrade.) + +- [ ] **Step 2: Write `planning/releases/0.8.4.md`** + +Create with this exact content: + +```markdown +# httpware 0.8.4 — OTel partial-install no longer crashes a live request + +**Patch release. Defensive fix. No API change.** Closes the two paired audit findings tracking the OpenTelemetry partial-install hazard. + +## The gap + +`httpware`'s observability layer treats `opentelemetry-api` as an optional extra. It detects whether the extra is installed via `find_spec("opentelemetry")` at module load time, then takes the OTel branch in `_emit_event` only if the flag is True. + +Two flaws in that gate let a partial install crash a live request: + +1. `opentelemetry` is a PEP 420 native namespace package. Any `opentelemetry-instrumentation-*` package creates the `opentelemetry/` directory, so `find_spec("opentelemetry")` returns a non-None spec even when `opentelemetry-api` is absent. +2. The lazy `from opentelemetry import trace` inside `_emit_event` was not wrapped in `try/except`. With the false-positive flag from (1), the import then raised `ImportError` mid-emit, crashing the middleware calling `_emit_event` — `AsyncRetry`, `Retry`, `AsyncBulkhead`, `Bulkhead` — in the middle of a live HTTP request. + +The audit's [chunk-2 finding](../audit/2026-06-07-deep-audit.md) named both halves of the hole; this release closes both. + +## The fix + +Two changes: + +- `import_checker.is_otel_installed` now probes `find_spec("opentelemetry.trace")`. The `opentelemetry.trace` sub-module ships with `opentelemetry-api`, so the flag is True only when the api package is actually importable. +- `_emit_event` wraps the lazy import in `try/except ImportError`. On failure (corrupt install, future namespace surprise, monkey-patched `sys.modules`), emission degrades to log-only — the structured log record fires unconditionally; the OTel `add_event` call is skipped. + +We catch `ImportError` specifically, not bare `Exception`. Misconfigured-tracer crashes (RuntimeError, AttributeError out of `trace.get_current_span().add_event(...)`) still surface; only the install-gate-is-wrong case is in scope. + +## Upgrade + +```bash +uv add httpware==0.8.4 +# or +pip install -U 'httpware==0.8.4' +``` + +No import changes. No API surface changes. No behavior change on the happy path (api package installed and importable). The only observable change is "no longer crashes" on partial installs. +``` + +- [ ] **Step 3: Lint** + +```bash +just lint-ci +``` + +Expected: green. eof-fixer may add a trailing newline. + +- [ ] **Step 4: Commit** + +```bash +git add planning/releases/0.8.4.md +git commit -m "$(cat <<'EOF' +docs(release): draft 0.8.4 notes — OTel partial-install hardening + +Two paired defensive fixes (find_spec target + try/except wrap) close +the chunk-2 partial-install audit findings. No API change; the only +observable behavior change is "no longer crashes" on partial installs. + +Co-Authored-By: Claude Opus 4.7 (1M context) +EOF +)" +``` + +- [ ] **Step 5: Push the branch + open the PR** + +```bash +git push -u origin fix/otel-partial-install +``` + +```bash +gh pr create --base main --head fix/otel-partial-install --title "fix(otel): harden partial-install detection + lazy-import (0.8.4)" --body "$(cat <<'EOF' +## Summary + +Closes 2 of the remaining audit findings — the OpenTelemetry partial-install hazard from chunk 2. See [`planning/specs/2026-06-08-otel-partial-install-design.md`](planning/specs/2026-06-08-otel-partial-install-design.md) for the design and [`planning/releases/0.8.4.md`](planning/releases/0.8.4.md) for the user-facing release notes. + +## The hazard + +- `find_spec("opentelemetry")` returns truthy because `opentelemetry/` is a PEP 420 namespace package — created by any `opentelemetry-instrumentation-*` install. +- `is_otel_installed` becomes True even when `opentelemetry-api` is absent. +- The lazy `from opentelemetry import trace` in `_emit_event` was unguarded — raised ImportError mid-request, crashing the middleware. + +## The fixes (paired per the audit's recommendation) + +- `import_checker.is_otel_installed = find_spec("opentelemetry.trace") is not None` — closes the false-positive at detection time. +- `_emit_event` wraps the lazy import in `try/except ImportError` — closes the runtime crash, degrades to log-only emission. + +## Audit findings closed + +| Severity | File:line | Closed by | +|---|---|---| +| Low | \`_internal/import_checker.py:8\` | find_spec("opentelemetry.trace") probe | +| Low | \`_internal/observability.py:40\` | try/except ImportError wrap | + +## Test plan + +- [x] New test pins the probe target via source-level assertion (\`find_spec("opentelemetry.trace")\` must appear in import_checker.py). +- [x] New test simulates the partial-install crash via \`sys.modules['opentelemetry'] = _BrokenOpenTelemetry()\` and verifies \`_emit_event\` returns without raising while still emitting the structured log record. +- [x] \`just lint-ci\` and full test suite green after each commit. + +## Release + +Tag \`0.8.4\` from the merge SHA after this PR lands. + +🤖 Generated with [Claude Code](https://claude.com/claude-code) +EOF +)" +``` + +Verify the PR URL is returned. + +- [ ] **Step 6: Final verification** + +```bash +git log --oneline -5 +gh pr view --json url,state,title +``` + +Expected: 3 commits on top of the spec (`a9858ea`) — `import_checker` fix, `observability` fix, release notes. PR open against main. + +Report the PR URL. + +--- + +## Self-review notes + +- **Spec coverage:** Spec finding #1 = T1 (import_checker probe target). Spec finding #2 = T2 (try/except wrap + docstring extension). Spec tests section = T1 + T2 test steps. Spec release notes section = T3. PR opening = T3 Step 5. +- **Placeholder scan:** All code blocks are complete with verbatim old_string / new_string. Test bodies are complete. Commit messages are filled in. No "TBD" / "similar to". +- **Type/name consistency:** `_TEST_LOGGER` is reused from the existing file in Task 2 (the test file already defines it). The `_BrokenOpenTelemetry` class is local to one test. `find_spec("opentelemetry.trace")` is the exact same string in source and test source-check. +- **TDD ordering:** T1 and T2 each follow red-green-commit. T1's red is a source-level assertion (the live `find_spec` calls pass either way under `--all-extras`, so source-pinning is the regression guard). T2's red is a real runtime ImportError out of `_emit_event`. diff --git a/planning/releases/0.8.4.md b/planning/releases/0.8.4.md new file mode 100644 index 0000000..e46eacd --- /dev/null +++ b/planning/releases/0.8.4.md @@ -0,0 +1,33 @@ +# httpware 0.8.4 — OTel partial-install no longer crashes a live request + +**Patch release. Defensive fix. No API change.** Closes the two paired audit findings tracking the OpenTelemetry partial-install hazard. + +## The gap + +`httpware`'s observability layer treats `opentelemetry-api` as an optional extra. It detects whether the extra is installed via `find_spec("opentelemetry")` at module load time, then takes the OTel branch in `_emit_event` only if the flag is True. + +Two flaws in that gate let a partial install crash a live request: + +1. `opentelemetry` is a PEP 420 native namespace package. Any `opentelemetry-instrumentation-*` package creates the `opentelemetry/` directory, so `find_spec("opentelemetry")` returns a non-None spec even when `opentelemetry-api` is absent. +2. The lazy `from opentelemetry import trace` inside `_emit_event` was not wrapped in `try/except`. With the false-positive flag from (1), the import then raised `ImportError` mid-emit, crashing the middleware calling `_emit_event` — `AsyncRetry`, `Retry`, `AsyncBulkhead`, `Bulkhead` — in the middle of a live HTTP request. + +The audit's [chunk-2 finding](../audit/2026-06-07-deep-audit.md) named both halves of the hole; this release closes both. + +## The fix + +Two changes: + +- `import_checker.is_otel_installed` now probes via `importlib.metadata.distribution("opentelemetry-api")` (inside a try/except `PackageNotFoundError` block). This checks the package registry directly: True only when the `opentelemetry-api` distribution is actually installed, regardless of whether some other package created the `opentelemetry/` namespace directory. Note: the obvious alternative — `find_spec("opentelemetry.trace")` — was rejected because CPython resolves submodule probes by importing the parent namespace package, which would have broken the existing transitive-import isolation guarantee enforced by `tests/test_optional_extras_isolation.py`. The metadata probe has no `sys.modules` side effects. +- `_emit_event` wraps the lazy `from opentelemetry import trace` in `try/except ImportError`. On failure (corrupt install, future namespace surprise, monkey-patched `sys.modules`), emission degrades to log-only — the structured log record fires unconditionally; the OTel `add_event` call is skipped. + +We catch `ImportError` specifically, not bare `Exception`. Misconfigured-tracer crashes (RuntimeError, AttributeError out of `trace.get_current_span().add_event(...)`) still surface; only the install-gate-is-wrong case is in scope. + +## Upgrade + +```bash +uv add httpware==0.8.4 +# or +pip install -U 'httpware==0.8.4' +``` + +No import changes. No API surface changes. No behavior change on the happy path (api package installed and importable). The only observable change is "no longer crashes" on partial installs. diff --git a/planning/specs/2026-06-08-otel-partial-install-design.md b/planning/specs/2026-06-08-otel-partial-install-design.md new file mode 100644 index 0000000..8321067 --- /dev/null +++ b/planning/specs/2026-06-08-otel-partial-install-design.md @@ -0,0 +1,211 @@ +# Spec: OTel partial-install hardening (0.8.4) + +**Date:** 2026-06-08 +**Topic slug:** `otel-partial-install` +**Branch:** `fix/otel-partial-install` +**Target release:** `0.8.4` — patch (defensive fix, no behavioral change to the happy path) +**Status:** drafted, awaiting user review + +## Purpose + +Close the two paired audit findings the [deep audit](../audit/2026-06-07-deep-audit.md) flagged as Chunk 2's optional-extras partial-install cluster: + +| # | Severity | File | Headline | +|---|---|---|---| +| 1 | Low | `src/httpware/_internal/import_checker.py:8` | `find_spec("opentelemetry")` returns truthy for the PEP-420 namespace package even when `opentelemetry-api` is absent | +| 2 | Low | `src/httpware/_internal/observability.py:40` | `_emit_event` does not wrap the lazy `from opentelemetry import trace` in `try/except ImportError` | + +Both findings are about the same partial-install hazard: a user installs `opentelemetry-instrumentation-X` (or any other package that creates the `opentelemetry/` namespace directory) without `opentelemetry-api`. Today: + +1. `find_spec("opentelemetry")` returns non-None because the namespace directory exists. +2. `is_otel_installed` becomes `True`. +3. `_emit_event` takes the otel branch and runs `from opentelemetry import trace`. +4. The lazy import raises `ImportError` (no api package to provide `trace`). +5. The exception escapes `_emit_event` and crashes whatever middleware called it — `AsyncRetry`, `Retry`, `AsyncBulkhead`, `Bulkhead` — in the middle of a live request. + +Audit fix: probe a sub-module that requires the api package (`opentelemetry.trace`), AND wrap the lazy import + `add_event` call in `try/except ImportError` as belt-and-braces. Either fix alone closes most of the hole; both together close it under every install permutation we can construct. + +## Non-goals + +- **No new exception types.** Failures degrade silently to the structured-log-only path; the contract is "OTel emission is best-effort." +- **No detection of broken installs at import time.** The flag stays a boolean at module load; we don't add a startup diagnostic that warns about partial installs. +- **No change to `is_pydantic_installed` or `is_msgspec_installed`.** Those don't suffer the same namespace hazard — `pydantic` and `msgspec` are concrete packages, not PEP-420 namespaces. +- **No change to logger names, event names, or attributes.** Public observability surface is untouched. +- **No new public API.** `_emit_event` stays internal. + +## Architecture + +### Two changes, one PR + +1. `src/httpware/_internal/import_checker.py`: `find_spec("opentelemetry")` → `find_spec("opentelemetry.trace")`. The `opentelemetry.trace` sub-module ships with `opentelemetry-api`; absent the api package, even with the namespace directory present, `find_spec` returns None. + +2. `src/httpware/_internal/observability.py`: wrap the lazy import and the `.add_event` call in `try/except ImportError`. On failure, the function returns silently — the structured log record on line 38 has already fired. + +### Why both, not just one + +Either fix in isolation would close the most common partial-install case. Together: + +- `find_spec("opentelemetry.trace")` rules out the namespace-package false positive at module-load time. Cheap. +- `try/except ImportError` defends against unexpected runtime breakage — a `RuntimeError`-style import failure (corrupt install, syntax error in an instrumentation hook, monkey-patched `sys.modules`), or any future case where `is_otel_installed=True` but the import still fails. + +Pairing them respects the audit's explicit recommendation that "the same release closes both ends of the partial-install hole." + +### Why ImportError, not Exception + +The lazy import can only fail with `ImportError` (or its subclasses like `ModuleNotFoundError`). Other exceptions (RuntimeError, AttributeError) escaping `from opentelemetry import trace` would indicate a serious environment problem we don't want to swallow. Catching `Exception` would mask real bugs in tracer libraries; catching `ImportError` matches the specific failure mode we're defending against. + +The follow-up `trace.get_current_span().add_event(...)` call is documented by OTel as never raising in the no-tracer-installed path (returns `NonRecordingSpan` whose `add_event` is a no-op). It can still raise if a tracer is *misconfigured*. To stay narrow, we keep the `try/except ImportError` around only the import — the `add_event` call sits outside the try block on the happy path. If we get reports of misconfigured-tracer crashes, we widen the catch in a future patch. + +## Per-change details + +### 1. `import_checker.py` + +old_string: +```python +is_otel_installed = find_spec("opentelemetry") is not None +``` + +new_string: +```python +# opentelemetry/ is a PEP 420 namespace package — instrumentation packages create it +# even without opentelemetry-api. Probe opentelemetry.trace (ships with api) instead. +is_otel_installed = find_spec("opentelemetry.trace") is not None +``` + +### 2. `observability.py` + +The current `_emit_event` body: + +```python + logger.log(level, message, extra=attributes) + if import_checker.is_otel_installed: + from opentelemetry import trace # noqa: PLC0415 — lazy by design (optional-extras isolation) + + trace.get_current_span().add_event(event_name, attributes=attributes) +``` + +becomes: + +```python + logger.log(level, message, extra=attributes) + if import_checker.is_otel_installed: + try: + from opentelemetry import trace # noqa: PLC0415 — lazy by design (optional-extras isolation) + except ImportError: + # opentelemetry namespace exists but the api package is broken or missing — + # degrade to log-only emission. The structured log record above has already fired. + return + trace.get_current_span().add_event(event_name, attributes=attributes) +``` + +### 3. Module docstring update (observability.py) + +The existing module docstring describes "If `opentelemetry-api` is installed, calls `trace.get_current_span().add_event(...)`." Extend the relevant docstring paragraph to note the soft-fallback behavior on partial-install: + +old text (in the `_emit_event` docstring near line 28-32): +``` + 2. If ``opentelemetry-api`` is installed, calls + ``trace.get_current_span().add_event(event_name, attributes=attributes)``. + When no tracer is active, ``get_current_span()`` returns a ``NonRecordingSpan`` + whose ``add_event`` is a documented no-op — so the call is unconditional + behind the install gate. +``` + +new text: +``` + 2. If ``opentelemetry-api`` is installed, calls + ``trace.get_current_span().add_event(event_name, attributes=attributes)``. + When no tracer is active, ``get_current_span()`` returns a ``NonRecordingSpan`` + whose ``add_event`` is a documented no-op — so the call is unconditional + behind the install gate. If the install gate is wrong (the namespace exists + but the api package is missing or broken), the lazy import raises + ``ImportError``; we degrade silently to log-only emission. +``` + +## Tests + +Add two tests to `tests/test_optional_extras_otel_missing.py` (the existing file that already patches `is_otel_installed`): + +### Test 1: `_emit_event` survives `ImportError` from the lazy import + +Simulate the partial-install crash: `is_otel_installed=True` but the lazy `from opentelemetry import trace` raises `ImportError`. The function must: +- log the structured record (already does) +- NOT raise + +Approach: use `monkeypatch.setitem(sys.modules, 'opentelemetry', _BrokenModule())` where `_BrokenModule` is an object whose `__getattr__` raises `ImportError`. This makes `from opentelemetry import trace` fail with ImportError at the `__getattr__` step. + +```python +def test_emit_event_survives_lazy_import_failure(caplog: pytest.LogCaptureFixture) -> None: + class _BrokenOpenTelemetry: + """Stand-in for opentelemetry/ namespace directory without api package.""" + + def __getattr__(self, name: str) -> object: + msg = f"cannot import name {name!r} from 'opentelemetry'" + raise ImportError(msg) + + with ( + patch("httpware._internal.import_checker.is_otel_installed", new=True), + patch.dict("sys.modules", {"opentelemetry": _BrokenOpenTelemetry()}), + ): + with caplog.at_level(logging.INFO, logger="httpware.test"): + _emit_event( + logging.getLogger("httpware.test"), + "test.event", + level=logging.INFO, + message="test message", + attributes={"k": "v"}, + ) + # Must not have raised. Log record must still have fired. + assert any(r.message == "test message" for r in caplog.records) +``` + +### Test 2: assertion about the install-detection logic + +Document the new `find_spec("opentelemetry.trace")` check via a focused test: + +```python +def test_is_otel_installed_uses_opentelemetry_trace_probe() -> None: + """The install probe must require opentelemetry-api, not just the namespace package. + + Re-running find_spec at test time confirms the production module's choice. If this + fails, the module-load-time constant in import_checker.py is using the wrong probe. + """ + from importlib.util import find_spec + assert find_spec("opentelemetry.trace") is not None # opentelemetry-api IS installed in CI + # The boolean derived from the probe must match. + from httpware._internal import import_checker + assert import_checker.is_otel_installed is True +``` + +This test runs in the `--all-extras` environment where `opentelemetry-api` IS installed, so the live check holds. It would fail if a future refactor reverted to `find_spec("opentelemetry")` AND used a stale snapshot from CI without api installed. + +## Verification + +After each commit: + +```bash +just lint-ci +uv run pytest tests/test_optional_extras_otel_missing.py tests/test_observability.py -x --no-cov -q +``` + +Plus the full suite at the end of the PR: + +```bash +uv run pytest -x --no-cov -q +``` + +## Release notes + +`planning/releases/0.8.4.md` — mirror the 0.8.1/0.8.3 structure. One-section release; bug fix only; no API change. Note that the partial-install scenario degrades from "crashes a live request" to "silently logs and skips the OTel emission." + +## Acceptance criteria + +1. Two fix commits + one release-notes commit + (optional) test-extension commit on branch `fix/otel-partial-install`. +2. `just lint-ci` and `uv run pytest` green after every commit. +3. PR opened against `main` with title `fix(otel): harden partial-install detection + lazy-import (0.8.4)`. +4. After merge, tag `0.8.4` from the merge SHA; GitHub Release published from `planning/releases/0.8.4.md`. +5. Memory `release_0_8_4_shipped` added to MEMORY.md. + +## Open questions + +None. Both fixes are precisely specified by the audit; the tests are straightforward; the release-notes shape is established. diff --git a/src/httpware/_internal/import_checker.py b/src/httpware/_internal/import_checker.py index fe14582..8536ed2 100644 --- a/src/httpware/_internal/import_checker.py +++ b/src/httpware/_internal/import_checker.py @@ -1,8 +1,26 @@ """Detect optional extras without importing them. Used by adapter modules to gate hard imports.""" +from importlib.metadata import PackageNotFoundError, distribution from importlib.util import find_spec +def _is_distribution_installed(name: str) -> bool: + """Probe the package registry for a distribution by name. No sys.modules side effects.""" + try: + distribution(name) + except PackageNotFoundError: + return False + return True + + is_msgspec_installed = find_spec("msgspec") is not None is_pydantic_installed = find_spec("pydantic") is not None -is_otel_installed = find_spec("opentelemetry") is not None +# opentelemetry/ is a PEP 420 namespace package — instrumentation packages create the +# directory even without opentelemetry-api. find_spec("opentelemetry") therefore returns +# non-None regardless of whether the api package is present, and +# find_spec("opentelemetry.trace") populates sys.modules with the namespace parent as a +# CPython side-effect, breaking the isolation guarantee. +# importlib.metadata.distribution probes the package registry instead: it returns the +# distribution when opentelemetry-api is installed and raises PackageNotFoundError when +# it is absent, with no sys.modules side effects. +is_otel_installed = _is_distribution_installed("opentelemetry-api") diff --git a/src/httpware/_internal/observability.py b/src/httpware/_internal/observability.py index 00227cb..36551e2 100644 --- a/src/httpware/_internal/observability.py +++ b/src/httpware/_internal/observability.py @@ -29,7 +29,9 @@ def _emit_event( ``trace.get_current_span().add_event(event_name, attributes=attributes)``. When no tracer is active, ``get_current_span()`` returns a ``NonRecordingSpan`` whose ``add_event`` is a documented no-op — so the call is unconditional - behind the install gate. + behind the install gate. If the install gate is wrong (the namespace exists + but the api package is missing or broken), the lazy import raises + ``ImportError``; we degrade silently to log-only emission. The lazy ``from opentelemetry import trace`` inside the if-block preserves the optional-extras isolation invariant: ``import httpware`` must not pull @@ -37,6 +39,10 @@ def _emit_event( """ logger.log(level, message, extra=attributes) if import_checker.is_otel_installed: - from opentelemetry import trace # noqa: PLC0415 — lazy by design (optional-extras isolation) - + try: + from opentelemetry import trace # noqa: PLC0415 — lazy by design (optional-extras isolation) + except ImportError: + # opentelemetry namespace exists but the api package is broken or missing — + # degrade to log-only emission. The structured log record above has already fired. + return trace.get_current_span().add_event(event_name, attributes=attributes) diff --git a/tests/test_optional_extras_otel_missing.py b/tests/test_optional_extras_otel_missing.py index 1c1b815..c44b5f7 100644 --- a/tests/test_optional_extras_otel_missing.py +++ b/tests/test_optional_extras_otel_missing.py @@ -11,14 +11,18 @@ """ import logging +import sys +from importlib.metadata import PackageNotFoundError, distribution from unittest.mock import patch import pytest +from httpware._internal import import_checker from httpware._internal.observability import _emit_event _TEST_LOGGER = logging.getLogger("httpware.test.otel_missing") +_NOT_FOUND_MSG = "simulated absent package" def test_emit_event_logs_record_without_otel(caplog: pytest.LogCaptureFixture) -> None: @@ -56,3 +60,97 @@ def test_emit_event_does_not_call_opentelemetry_apis_when_flag_false() -> None: ) mock_get_span.assert_not_called() + + +def test_is_otel_installed_uses_opentelemetry_trace_probe() -> None: + """The install probe must use the package registry, not find_spec on the namespace. + + `opentelemetry` is a PEP 420 namespace — instrumentation packages create the + directory even when `opentelemetry-api` is absent. find_spec("opentelemetry") + returns non-None regardless, giving a false positive. + + find_spec("opentelemetry.trace") would fix the false-positive but causes CPython + to load the opentelemetry namespace package into sys.modules as a side-effect, + breaking the transitive-import isolation guarantee. + + importlib.metadata.distribution("opentelemetry-api") probes the package registry + directly: no sys.modules side-effects, and it raises PackageNotFoundError when + opentelemetry-api is absent. This test pins that contract. + """ + # In CI (opentelemetry-api IS installed), distribution succeeds. + assert distribution("opentelemetry-api") is not None + assert import_checker.is_otel_installed is True + + # The structural assertions: the module must probe via the metadata distribution + # call, keyed on "opentelemetry-api". Ensures a future revert to find_spec on the + # namespace package trips the regression guard. + source = __import__("inspect").getsource(import_checker) + fail_msg = ( + "import_checker must probe via importlib.metadata.distribution('opentelemetry-api') " + "(PEP 420 namespace hazard + sys.modules side-effect); " + "see planning/audit/2026-06-07-deep-audit.md (Low finding on import_checker.py:8)." + ) + assert "distribution" in source, fail_msg + assert '"opentelemetry-api"' in source, fail_msg + + +def test_emit_event_survives_lazy_import_failure( + caplog: pytest.LogCaptureFixture, + monkeypatch: pytest.MonkeyPatch, +) -> None: + """_emit_event degrades to log-only when the lazy OTel import fails. + + When is_otel_installed=True but ``from opentelemetry import trace`` raises + ImportError, _emit_event must not crash — it emits the structured log record + and returns. + + Simulates the partial-install case: opentelemetry/ namespace directory exists + (created by some instrumentation package) but opentelemetry-api is missing or + broken, so importing ``trace`` from it fails. + """ + + class _BrokenOpenTelemetry: + """Stand-in for opentelemetry/ namespace directory without working api.""" + + def __getattr__(self, name: str) -> object: + msg = f"cannot import name {name!r} from 'opentelemetry'" + raise ImportError(msg) + + # monkeypatch.setitem handles save/restore automatically — no manual finally. + monkeypatch.setitem(sys.modules, "opentelemetry", _BrokenOpenTelemetry()) + with ( + patch("httpware._internal.import_checker.is_otel_installed", True), + caplog.at_level(logging.WARNING, logger="httpware.test.otel_missing"), + ): + _emit_event( + _TEST_LOGGER, + "test.event", + level=logging.WARNING, + message="survives broken otel", + attributes={"k": "v"}, + ) + + # The structured log record still fired despite the OTel branch failing. + assert any(r.message == "survives broken otel" for r in caplog.records) + + +def test_is_distribution_installed_returns_false_on_package_not_found( + monkeypatch: pytest.MonkeyPatch, +) -> None: + """The probe helper returns False when distribution() raises PackageNotFoundError. + + Covers the partial-install detection path: opentelemetry-api absent → False. + Under --all-extras CI, opentelemetry-api IS installed, so the False branch is + only reachable by patching the underlying distribution() call. + """ + + def _raise_not_found(_name: str) -> None: + raise PackageNotFoundError(_NOT_FOUND_MSG) + + monkeypatch.setattr(import_checker, "distribution", _raise_not_found) + assert import_checker._is_distribution_installed("anything") is False # noqa: SLF001 + + +def test_is_distribution_installed_returns_true_for_known_installed_package() -> None: + """The probe helper returns True for a package that IS installed (opentelemetry-api in CI).""" + assert import_checker._is_distribution_installed("opentelemetry-api") is True # noqa: SLF001