release: v0.3.0 #8

Merged
random-walks merged 1 commit into main from release/v0.3.0
Apr 25, 2026
@random-walks (Owner)

Release PR for citeformer v0.3.0.

When this merges, push the local v0.3.0 annotated tag to trigger the PyPI publish workflow:

git push origin v0.3.0

(Tag created locally at commit c1b77e4; not yet pushed per the release skill default.)

Summary

Bundles the OpenRouter, Fireworks, and Together backends, the Anthropic revamp, and the async surface work that landed in PRs #6 and #7 since v0.2.0. Highlights: three new backends (10 total now), full async parallel surface end-to-end, Anthropic backend revamp (prompt caching, real messages.stream(), cited_text preservation), GenerationResult.usage token accounting, and 6 new ADRs (012-017) documenting the work and 3 explicit deferral decisions.

Suite at the bump: 644 unit tests + 4 schema integration + 8 connectivity (4 live-passed). Ruff check + format clean, mypy strict (53 src files), sphinx-build -W green.

§10.3 contract bump: GenerationResult.schema_version 2 → 3 (additive — usage field, plus three optional Citation fields). Pre-bump v2 serialisations deserialise cleanly into v3. See ADR-012 and ADR-013 for the ceremony.
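
To illustrate what "additive" means for the 2 → 3 bump, here is a minimal sketch of a loader that accepts both versions. It is not citeformer's actual deserialiser: the `Result` and `Usage` classes, the `input_tokens` / `output_tokens` field names, and the `load_result` helper are all hypothetical stand-ins, and stamping the loaded object with version 3 is an assumption about the upgrade policy. Only `schema_version` and `usage` come from the PR description.

```python
from dataclasses import dataclass
from typing import Any, Optional

CURRENT_SCHEMA_VERSION = 3  # v3 adds the optional `usage` field


@dataclass
class Usage:
    # Hypothetical field names; the real usage object may differ.
    input_tokens: int = 0
    output_tokens: int = 0


@dataclass
class Result:
    # Hypothetical, trimmed-down stand-in for GenerationResult.
    text: str
    schema_version: int = CURRENT_SCHEMA_VERSION
    usage: Optional[Usage] = None


def load_result(payload: dict[str, Any]) -> Result:
    """Load a v2 or v3 payload; fields added in v3 default to None."""
    version = payload.get("schema_version", CURRENT_SCHEMA_VERSION)
    if version > CURRENT_SCHEMA_VERSION:
        raise ValueError(f"unsupported schema_version: {version}")
    raw_usage = payload.get("usage")  # absent in v2 payloads
    usage = Usage(**raw_usage) if raw_usage is not None else None
    # Assumption: loaded objects are upgraded to the current version.
    return Result(text=payload["text"], usage=usage)


old = load_result({"schema_version": 2, "text": "hello"})
new = load_result(
    {"schema_version": 3, "text": "hello",
     "usage": {"input_tokens": 5, "output_tokens": 7}}
)
```

Because every v3 field is optional with a default, a v2 payload needs no migration step; only payloads from a newer, unknown version are rejected.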

What's in the box

Backends: 10 now (was 7 at v0.2.0).

  • New: OpenRouterBackend (multi-provider routing with provider.require_parameters), FireworksBackend (native GBNF — drops citeformer's cite-id rule in unchanged for true logit-tier on a hosted API), TogetherBackend (strict json_schema on Llama / Qwen / DeepSeek).
  • Revamped: AnthropicBackend — prompt caching on by default, real messages.stream() block-level streaming, cited_text / source_span / document_title preserved on every Citation, temperature no longer silently dropped.

Async surface end-to-end (ADR-014): Backend.agenerate / astream with asyncio.to_thread defaults; Citeformer.agenerate / astream returning new AsyncStreamingResult; native overrides on OpenAI + Anthropic (cascading to OpenRouter / Fireworks / Together via subclass inheritance).
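
The default-plus-override pattern described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the code from ADR-014: the single-argument `generate(prompt)` signature and the `EchoBackend`, `NativeAsyncBackend`, and `ChildBackend` classes are hypothetical. Only the `agenerate` name and the use of `asyncio.to_thread` as the default come from the PR description.

```python
import asyncio


class Backend:
    """Hypothetical base class with a simplified signature."""

    def generate(self, prompt: str) -> str:
        raise NotImplementedError

    async def agenerate(self, prompt: str) -> str:
        # Default: run the blocking sync method in a worker thread so
        # every backend gets an async entry point without extra code.
        return await asyncio.to_thread(self.generate, prompt)


class EchoBackend(Backend):
    # Sync-only backend; relies on the to_thread default.
    def generate(self, prompt: str) -> str:
        return f"echo: {prompt}"


class NativeAsyncBackend(EchoBackend):
    # Backend whose client is natively async overrides the default.
    async def agenerate(self, prompt: str) -> str:
        await asyncio.sleep(0)  # stand-in for a real awaitable API call
        return f"native: {prompt}"


class ChildBackend(NativeAsyncBackend):
    # Subclasses inherit the native override with no additional code.
    pass


async def main() -> list[str]:
    # Both calls run concurrently on the same event loop.
    return await asyncio.gather(
        EchoBackend().agenerate("a"),
        ChildBackend().agenerate("b"),
    )


results = asyncio.run(main())
```

The last class shows the cascade: a backend that subclasses one with a native override picks up the native path automatically, which is how the PR describes OpenRouter, Fireworks, and Together inheriting theirs.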

Token usage on every call (ADR-012): GenerationResult.usage populated for all five API backends; OpenRouter additionally surfaces per-call cost in cost_credits.

verify() against cited_text: when Anthropic populates the cited span, NLI scores against just that span — sharper signal on long documents.
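
One way to picture the premise selection is the sketch below. It covers only the fallback logic, not the NLI scoring itself, and it is not citeformer's verify() implementation: the `select_premise` helper, the `source_id` field, and the `documents` mapping are hypothetical. Only the `cited_text` field name comes from the PR description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Citation:
    # Hypothetical, trimmed-down stand-in for citeformer's Citation.
    source_id: str
    cited_text: Optional[str] = None


def select_premise(citation: Citation, documents: dict[str, str]) -> str:
    """Pick the text an NLI model would score the claim against."""
    if citation.cited_text:
        # The provider returned the exact span: score against it alone.
        return citation.cited_text
    # No span available: fall back to the whole source document.
    return documents[citation.source_id]


docs = {"d1": "A long document with many unrelated sentences."}
with_span = select_premise(Citation("d1", cited_text="unrelated sentences"), docs)
without_span = select_premise(Citation("d1"), docs)
```

Scoring against the narrower span keeps unrelated sentences in a long document from diluting the entailment score.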

Doc-pin verification caught and fixed two real OpenRouter correctness issues (deprecated usage:{include:true} flag, cost_usd mislabel) before they shipped.

Tier-honesty docs rewrite — README, docs/index.md, architecture doc all reframed away from the stale "schema-tier vs logit-tier" split (every modern provider's strict structured-outputs is real token-level constrained sampling now). Honest distinction is where the masking runs (in-process vs provider-runtime).

Changelog

See CHANGELOG.md [0.3.0] section for the full per-feature breakdown.

Test plan

  • make lint green (ruff check + format + mypy strict, 53 src files).
  • make test green (644 unit + 4 schema integration; 40 deselected integration/gpu/network).
  • make docs-build green (sphinx-build -W).
  • uv build builds sdist + wheel cleanly into dist/.
  • citeformer.__version__ == "0.3.0" after the bump.
  • No TODO / FIXME / XXX in src/.
  • CHANGELOG [0.3.0] populated with substantive content; new empty [Unreleased] above; compare links updated.
  • Annotated tag v0.3.0 created locally pointing at the bump commit (push manually after merge).

🤖 Generated with Claude Code

@random-walks random-walks merged commit 054c7d2 into main Apr 25, 2026
7 checks passed