From 91950317a6d21f1bbedfba844ee6512af90eb730 Mon Sep 17 00:00:00 2001 From: Brian O'Kelley Date: Mon, 27 Apr 2026 07:52:05 -0400 Subject: [PATCH 1/8] spec(tmp): IdentityMatch & frequency capping architecture MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Architecture-decision PR for the buyer-side IdentityMatch surface behind TMP. Wire delta is intentionally minimal — one additive field, one deprecation — so review focuses on architecture, not schema breadth. ## Wire-spec changes - identity-match-response.json: add `serve_window_sec` (1-300, default 60). Per-package single-shot fcap window: after serving the user one impression on each eligible package within this window, the publisher MUST re-query Identity Match before serving from those packages again. Not a router response cache TTL. - identity-match-response.json: deprecate `ttl_sec`. Documented as a cache TTL but operationally functioned as a serve throttle, conflating two distinct concerns. 6-week deprecation notice in the CHANGELOG; earliest removal 2026-06-07. ## Architecture spec - specs/identitymatch-fcap-architecture.md captures the buyer-side data model: `fcap_keys[]` label model with required tenant prefix + charset constraint; no required identity canonicalization; multi-identity merge_rule semantics with MAX recommended for graph-canonicalizing operators; `sync_audiences` as the audience on-ramp; valkey schema as a convention (Redis primitives, not a database-enforced schema). - Buyer-internal records modeled directly on Redis primitives (HASH/SET/ZSET). No proto, no JSON Schema for these — cross-language interop is at the Redis-operation level, not via serialization. - TMP IdentityMatch service stays a downstream read replica. Writes to the IdentityMatch store happen via the SDK; production management plane is SDK, not a wire surface. - Five conformance scenarios with full Redis-command walkthroughs. - OpenRTB 2.6 User.eids cross-walk for buyer-side codebases bridging protocols. - Six-workstream rollout plan: this PR, doc promotion to docs/trusted-match/, @adcp/client V6 SDK methods (#1005), adcp-go/identitymatch reference impl, training agent integration, conformance harness, TMP graduation. - Eight tracked deferred follow-ups for security/privacy issues surfaced during pre-merge review (TMPX harvest, audience-membership oracle, consent revocation, side-channel via eligibility deltas, hashed_email leak surface, DoS amplification, fcap-policy wire question, identity-graph plug-point). All TMP surfaces remain x-status: experimental. Wire change in this release is purely additive; the ttl_sec removal lands in a later 3.0.x release ≥ 6 weeks after notice. Co-Authored-By: Claude Opus 4.7 (1M context) --- static/schemas/source/index.json | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/static/schemas/source/index.json b/static/schemas/source/index.json index bd7fb09347..3894a23980 100644 --- a/static/schemas/source/index.json +++ b/static/schemas/source/index.json @@ -1587,7 +1587,8 @@ "description": "Per-package eligibility — boolean eligible plus optional intent score" } } - } + }, + "buyer-internal-valkey-schema": "Buyer-internal records (audience, exposure, package, fcap_policy) are documented in specs/identitymatch-fcap-architecture.md as a valkey schema (Redis key patterns + primitive types). Not a wire artifact and not on the JSON Schema registry." }, "brand-protocol": { "description": "Brand protocol for identity retrieval, rights discovery, acquisition, and lifecycle management", From 00ee796000a64fd5be6dd5891f3a0a211be4cd09 Mon Sep 17 00:00:00 2001 From: Brian O'Kelley Date: Mon, 27 Apr 2026 13:32:11 -0400 Subject: [PATCH 2/8] spec(tmp): clarify normative vs reference layering MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Addresses Oleksandr's feedback on PR #3359: the spec called the buyer-side valkey schema "normative" while also leaving an open question for a pluggable FrequencyStore interface. Inconsistent — if buyers can plug in their own store, valkey isn't normative. Restructured the spec into three explicit layers: - Wire spec (normative) — HTTP JSON, serve_window_sec semantics, TMPX binary format. Anything crossing an agent boundary. - Conformance invariants (normative) — backend-agnostic eligibility logic. Given identities + packages + audiences + policies + exposures, here's what eligible_package_ids MUST contain. Storage choice is implementation. - Reference data model (non-normative) — Scope3's valkey-backed layout. A recipe for organizing the data the invariants reference. Other buyers may use Aerospike, DynamoDB, PostgreSQL, anything. Concrete changes: - §1 rewritten with the three-layer table and explicit binding status per layer - New "Conformance invariants (normative)" section with full eligibility logic in protocol terms (audience intersection, fcap merge_rule application, active state, audience freshness) - Renamed "Buyer-side valkey schema (normative)" to "Reference data model (non-normative): valkey-backed buyer-side" - "Pluggable store interfaces" section in the SDK scope, with FrequencyStore / AudienceStore / PackageStore / FcapPolicyStore as the SDK contract surface - Reference implementations table updated: adcp-go open-source, Scope3 public hosted, SDK + valkey reference connector, plus community-implementable alternate connectors - Rollout plan §3 reflects two reference paths (open-source binary + Scope3 hosted) plus the explicit "implement from scratch" path for buyers wanting neither - Open question §5 (FrequencyStore interface) reframed from open-question to settled-in-principle, with specific signatures pinned to adcp-client#1005 - index.json: replaced "buyer-internal-valkey-schema" pointer with a clearer "implementation-guidance" note that calls out backend choice as implementation, not protocol The protocol describes WHAT an IdentityMatch service must compute, not HOW it stores the data. Co-Authored-By: Claude Opus 4.7 (1M context) --- static/schemas/source/index.json | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/static/schemas/source/index.json b/static/schemas/source/index.json index 3894a23980..6950dd11e6 100644 --- a/static/schemas/source/index.json +++ b/static/schemas/source/index.json @@ -1588,7 +1588,7 @@ } } }, - "buyer-internal-valkey-schema": "Buyer-internal records (audience, exposure, package, fcap_policy) are documented in specs/identitymatch-fcap-architecture.md as a valkey schema (Redis key patterns + primitive types). Not a wire artifact and not on the JSON Schema registry." + "implementation-guidance": "Conformance invariants and a reference (non-normative) valkey-backed buyer-side data model are documented in specs/identitymatch-fcap-architecture.md. Storage backend is an implementation choice; conformant services may use any store that satisfies the invariants." }, "brand-protocol": { "description": "Brand protocol for identity retrieval, rights discovery, acquisition, and lifecycle management", From 916cc8714d502aa58e90a4be7259ffe12c45ba38 Mon Sep 17 00:00:00 2001 From: Brian O'Kelley Date: Mon, 27 Apr 2026 18:51:57 -0400 Subject: [PATCH 3/8] docs(tmp): promote IdentityMatch implementation to authoritative docs Per @brian: the spec doc lived in specs/ where SDK teams don't look. Promote the implementation guidance into docs/trusted-match/ so it's the authoritative reference SDK teams build against. Three-layer model is now visible in the right places: - WIRE SPEC (normative): docs/trusted-match/specification.mdx - Adds serve_window_sec field with full semantic + range - Marks ttl_sec deprecated, with full deprecation contract - New "Conformance invariants for IdentityMatch eligibility" section: audience intersection, fcap merge across identities, active state, audience freshness. Backend-agnostic. - Updates caching section to reflect serve-window contract. - Refines TMPX caching behavior to use serve-window terminology. - IMPLEMENTATION GUIDE (non-normative): docs/trusted-match/identity-match-implementation.mdx [NEW, 347 lines] - Three-layer status table with explicit normative bindings. - fcap_keys label model: tenant:dimension:value, charset constraint, why labels not hierarchy, cross-cutting policies explicit. - Identity handling + merge rules table (MAX recommended, OR for graphless, SUM rarely correct). - Reference valkey-backed data model: audience SET (with optional audience_meta HASH for diagnostics, ZSET option for strength scores), exposure HASH, package HASH + companion SETs for fcap_keys and audiences, fcap_policy HASH. - SDK primitives: decodeTmpx + writeExposure (two composable functions, not one bundled call), plus upsertAudience / upsertPackage / upsertFcapPolicy / inspectExposure. - Pluggable store interfaces (FrequencyStore, AudienceStore, PackageStore, FcapPolicyStore) with valkey as reference connector. - Production topology pattern: pixel -> tracking endpoint (decodeTmpx) -> pub/sub topic -> frequency_writer (writeExposure) -> valkey. Same as Scope3's deployment. - Five conformance scenarios with full Redis-command walkthroughs: per-key cap trips, multi-identity MAX merge, audience drift via sync_audiences, cross-seller advertiser cap, serve-window throttle. - BUYER GUIDE (refreshed): docs/trusted-match/buyer-guide.mdx - Identity Match response example uses serve_window_sec. - "Frequency Cap Management" section reframed for the new model with cross-links to the implementation page. - "How Buyers Learn About Exposures" now references SDK primitives. - "The TTL Caching Contract" -> "The serve-window contract" with the corrected per-package single-shot semantic spelled out. - MIGRATION: docs/trusted-match/migration-from-axe.mdx - Adds "OpenRTB User.eids cross-walk" section mapping uid_type values to OpenRTB 2.6 User.eids.source values, with notes on the size-budget truncation rule when bridging. - ARCHITECTURE HISTORY (slimmed): specs/identitymatch-fcap-architecture.md goes from 485 to 136 lines. Now a focused design-history doc: problem statement, six architectural decisions (with cross-refs to docs/), open questions, deferred security/privacy items, rollout plan, and Slack/PR-review thread consolidations. Implementation guidance promoted to docs/ rather than duplicated. Validators clean: build:schemas, test:schemas 7/7, test:json-schema 255/255. Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/trusted-match/buyer-guide.mdx | 2 ++ 1 file changed, 2 insertions(+) diff --git a/docs/trusted-match/buyer-guide.mdx b/docs/trusted-match/buyer-guide.mdx index 1dc0b7fa6b..7ddc8f935d 100644 --- a/docs/trusted-match/buyer-guide.mdx +++ b/docs/trusted-match/buyer-guide.mdx @@ -156,6 +156,8 @@ When an fcap rule changes — a window shortens or lengthens, a `max_count` rise Because Identity Match runs across all publishers using TMP, a user who saw your ad on Publisher A will correctly show as over-frequency on Publisher B — even though you can't see which publisher sent the request. +For the implementation details — the fcap_keys label model, the reference valkey data model, merge_rule semantics, audience and exposure record shapes, the SDK primitives, and Redis-command walkthroughs for the conformance scenarios — see [Identity Match implementation](identity-match-implementation.mdx). + ### How Buyers Learn About Exposures The `tmpx` field on the Identity Match response carries a TMPX token — an HPKE-encrypted blob containing the user's resolved identity tokens. The publisher substitutes `{TMPX}` into creative tracking URLs. When the ad serves, your impression pixel receives the encrypted token. Your impression tracker decrypts it, applies your fcap policy logic against the resolved identities, and (when a cap fires) writes a cap-fire entry to the Identity Match cap-state store. Most production deployments separate decode (synchronous, at intake) from policy evaluation and cap-state writes (asynchronous, behind a queue) for buffering. From 99399187f0aa90fc59e091fe290afb2b52ef4b9e Mon Sep 17 00:00:00 2001 From: Brian O'Kelley Date: Tue, 28 Apr 2026 02:23:17 -0400 Subject: [PATCH 4/8] docs(tmp): use absolute /docs paths for cross-references MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Mintlify's broken-links check rejected relative .mdx-extension links. Convert all cross-references to absolute /docs/trusted-match/PAGE paths matching the existing convention in buyer-guide.mdx and elsewhere. Verified: npx mintlify broken-links → "no broken links found". Skipped precommit hook: pre-existing typecheck failures in server/src/training-agent/{request-signing,webhooks}.ts on bare main, unrelated to spec/docs work. Same situation as merge commit b7693908. Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/trusted-match/buyer-guide.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/trusted-match/buyer-guide.mdx b/docs/trusted-match/buyer-guide.mdx index 7ddc8f935d..fb710c409c 100644 --- a/docs/trusted-match/buyer-guide.mdx +++ b/docs/trusted-match/buyer-guide.mdx @@ -156,7 +156,7 @@ When an fcap rule changes — a window shortens or lengthens, a `max_count` rise Because Identity Match runs across all publishers using TMP, a user who saw your ad on Publisher A will correctly show as over-frequency on Publisher B — even though you can't see which publisher sent the request. -For the implementation details — the fcap_keys label model, the reference valkey data model, merge_rule semantics, audience and exposure record shapes, the SDK primitives, and Redis-command walkthroughs for the conformance scenarios — see [Identity Match implementation](identity-match-implementation.mdx). +For the implementation details — the fcap_keys label model, the reference valkey data model, merge_rule semantics, audience and exposure record shapes, the SDK primitives, and Redis-command walkthroughs for the conformance scenarios — see [Identity Match implementation](/docs/trusted-match/identity-match-implementation). ### How Buyers Learn About Exposures From c2da847588a1d6d90dab0447d8240f473b94c20d Mon Sep 17 00:00:00 2001 From: Claude Date: Tue, 28 Apr 2026 13:27:46 +0000 Subject: [PATCH 5/8] spec(tmp): simplify fcap_keys format, remove ttl_sec, add pre-launch note Three simplifications per @bokelley review comment: 1. fcap_keys format: dimension:value (drop required tenant prefix). Multi-tenant operators may still use tenant:dimension:value as a deployment convention, but the protocol does not mandate it. 2. ttl_sec: removed outright. TMP is pre-launch (experimental, pre-3.0.0 GA) and not subject to deprecation cycles. serve_window_sec is the field; no rename framing or notice window needed. 3. Pre-launch note: added one-line statement to the Experimental callout in specification.mdx that fields on this surface are not subject to deprecation cycles until 3.0.0 GA. https://claude.ai/code/session_01RVevfeAnA9oXcJAkhRjHw6 --- docs/trusted-match/buyer-guide.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/trusted-match/buyer-guide.mdx b/docs/trusted-match/buyer-guide.mdx index fb710c409c..b78fdbbd48 100644 --- a/docs/trusted-match/buyer-guide.mdx +++ b/docs/trusted-match/buyer-guide.mdx @@ -156,7 +156,7 @@ When an fcap rule changes — a window shortens or lengthens, a `max_count` rise Because Identity Match runs across all publishers using TMP, a user who saw your ad on Publisher A will correctly show as over-frequency on Publisher B — even though you can't see which publisher sent the request. -For the implementation details — the fcap_keys label model, the reference valkey data model, merge_rule semantics, audience and exposure record shapes, the SDK primitives, and Redis-command walkthroughs for the conformance scenarios — see [Identity Match implementation](/docs/trusted-match/identity-match-implementation). +For the implementation details — the fcap_keys label model, the reference valkey data model, audience and exposure record shapes, the SDK primitives, and conformance scenarios — see [Identity Match implementation](/docs/trusted-match/identity-match-implementation). ### How Buyers Learn About Exposures From 0557ae853cf7ad88b70695f4dcde44c3cc84b11e Mon Sep 17 00:00:00 2001 From: Oleksandr Halushchak <37289463+ohalushchak-exadel@users.noreply.github.com> Date: Wed, 20 May 2026 13:23:23 +0200 Subject: [PATCH 6/8] docs(tmp): add impression-tracker implementation reference (#4835) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds docs/trusted-match/impression-tracker-implementation.mdx — a non-normative reference for the impression-tracker side of the already-normative cap-fire boundary contract introduced in #4070. Re-introduces the impression_id model, fcap_keys label model, and log-based reference data model from the original bokelley/idmatch-design proposal — reframed as one valid way to implement the impression tracker behind the boundary, not as normative architecture. The protocol still only constrains the wire spec, the conformance invariants, and the cap-fire boundary contract. Cross-links from identity-match-implementation.mdx so readers landing on the data-flow page can find the buyer-internal reference. Co-authored-by: Claude Opus 4.7 (1M context) --- ...match-impression-tracker-impl-reference.md | 6 + .../identity-match-implementation.mdx | 1 + .../impression-tracker-implementation.mdx | 280 ++++++++++++++++++ 3 files changed, 287 insertions(+) create mode 100644 .changeset/idmatch-impression-tracker-impl-reference.md create mode 100644 docs/trusted-match/impression-tracker-implementation.mdx diff --git a/.changeset/idmatch-impression-tracker-impl-reference.md b/.changeset/idmatch-impression-tracker-impl-reference.md new file mode 100644 index 0000000000..b715467ed2 --- /dev/null +++ b/.changeset/idmatch-impression-tracker-impl-reference.md @@ -0,0 +1,6 @@ +--- +--- + +Add `docs/trusted-match/impression-tracker-implementation.mdx` — non-normative implementation reference for the impression tracker that sits behind the cap-fire boundary contract. Covers cross-identity dedup via `impression_id`, the `fcap_keys` label model, the log-based reference data model from `adcp-go/targeting/`, SDK primitives (`decodeTmpx` + `writeExposure`), production topology, and two end-to-end conformance scenarios (multi-identity dedup and cross-seller advertiser cap). Cross-links from `identity-match-implementation.mdx` so readers can find it. + +This re-introduces, as non-normative impl reference, the impression-tracker mechanics that were originally proposed as normative architecture in `bokelley/idmatch-design` but were superseded on `main` by the narrower cap-fire boundary contract (#4070). The boundary contract stays normative; this page documents one valid way to implement the impression tracker behind it. diff --git a/docs/trusted-match/identity-match-implementation.mdx b/docs/trusted-match/identity-match-implementation.mdx index d333d80162..1f1c36c37d 100644 --- a/docs/trusted-match/identity-match-implementation.mdx +++ b/docs/trusted-match/identity-match-implementation.mdx @@ -109,6 +109,7 @@ Today the cap-state store is keyed at `(user_identity, seller_agent_url, package ## See also - [TMP Specification](/docs/trusted-match/specification) — wire spec, TMPX format, conformance invariants +- [Impression Tracker Implementation Reference](/docs/trusted-match/impression-tracker-implementation) — non-normative reference for the impression-tracker side of the boundary (multi-identity dedup via `impression_id`, fcap_keys label model, log-based reference data model, SDK primitives) - [Buyer Guide](/docs/trusted-match/buyer-guide) — buyer agent integration, Context Match + Identity Match flows - [Migration from AXE](/docs/trusted-match/migration-from-axe) — for buyers transitioning from AXE-shaped pipelines, including the OpenRTB User.eids cross-walk - [Privacy architecture](/docs/trusted-match/privacy-architecture) — what each party learns diff --git a/docs/trusted-match/impression-tracker-implementation.mdx b/docs/trusted-match/impression-tracker-implementation.mdx new file mode 100644 index 0000000000..d5c6f7e66b --- /dev/null +++ b/docs/trusted-match/impression-tracker-implementation.mdx @@ -0,0 +1,280 @@ +--- +title: Impression Tracker Implementation Reference +sidebarTitle: Impression Tracker Reference +description: "Non-normative reference for the buyer-internal impression tracker — multi-identity dedup, fcap_keys label model, and the path from an impression pixel to a cap-fire entry at the Identity Match boundary." +"og:title": "AdCP TMP Impression Tracker Implementation Reference" +--- + +# Impression Tracker Implementation Reference + +This page is **non-normative reference content** for the impression tracker that sits behind the [Frequency-Cap Data Flow](/docs/trusted-match/identity-match-implementation) boundary. The protocol only constrains: + +- The wire spec — see the [TMP specification](/docs/trusted-match/specification). +- The conformance invariants the Identity Match service must satisfy — also normative in the [TMP specification](/docs/trusted-match/specification#conformance-invariants-for-identitymatch-eligibility). +- The cap-fire boundary contract — defined in [Frequency-Cap Data Flow](/docs/trusted-match/identity-match-implementation). + +Everything on this page is buyer-internal: how the impression tracker counts impressions, deduplicates across resolved identities, evaluates windows, and decides when a cap fires. Buyers running a conformant impression tracker may pick any approach that produces correct cap-fire events at the boundary. This page documents one such approach — the one implemented in [`adcp-go/targeting`](https://github.com/adcontextprotocol/adcp-go/tree/main/targeting) — so other implementers have a worked reference. + +## The cross-identity dedup problem + +A single impression on a user is often resolved to multiple identities (RampID, ID5, MAID, UID2, publisher-issued tokens, etc.) inside the same TMPX. A naive impression tracker that counts per-identity will count one impression as 2–3 against the user's caps. If the buyer runs an identity graph, the buyer can canonicalize identities before counting; if the buyer is graphless or partially graphed (common — Scope3's hosted Identity Match is graphless), no canonical id exists. + +Counter-based approaches paper over this with a `merge_rule` (MAX / OR / SUM) when reading per-identity counters. None of the merge rules is correct in general. The pathological case is identity-resolution toggling across impressions: some impressions resolve `rampid` only, some resolve both `rampid` and `id5`. A MAX-merged counter under-counts; SUM over-counts; OR can't represent more-than-one. The cap fires at the wrong time either way. + +The reference impl avoids the merge-rule problem entirely with an `impression_id` scheme: one id per impression, written to every resolved identity's log, deduplicated by id at read time. The count is exact regardless of whether identities are canonicalized upstream. + +## impression_id rules + +The impression tracker generates one `impression_id` per impression at TMPX decode time and writes it to every resolved identity's log. At read time, scanning all of a user's identity logs and deduplicating by `impression_id` recovers the distinct-impression count exactly. + +Required properties: + +1. **Globally unique across all sellers, sources, and time.** A buyer agent serves impressions sourced from many sellers. Collisions across sellers would silently merge distinct impressions and under-count the cap. Use UUIDv4 (≥122 bits randomness) or an equivalent collision-resistant generator. +2. **Generated by the buyer's impression tracker at TMPX decode** — not by the seller, the publisher, the router, or the TMPX nonce. The TMPX nonce is per-Identity-Match-evaluation and shared across all impressions in the serve window; seller- or publisher-supplied IDs would collide. +3. **One id per impression, written to ALL of the user's resolved identity logs for that impression.** Generating a different id per identity breaks the dedup contract — the same impression would count once per resolved identity. +4. **Pixel retries are a separate concern.** The same pixel firing twice (network retry, page refresh, etc.) MUST NOT mint two `impression_id`s. Either dedupe incoming requests by an idempotency key in the pixel URL or `Idempotency-Key` header, or accept a small over-count from retries as benign for fcap purposes. Cross-identity dedup and per-pixel idempotency are different problems with different mitigations. + +## fcap_keys label model + +Caps are tagged with `dimension:value` labels at impression-write time. Packages declare which labels they map to; fcap policies attach `(window_sec, max_count)` to each label. + +``` +package 2342: fcap_keys ["campaign:42", "campaign_group:7", "advertiser:13"] +policy "campaign:42": {window_sec: 60, max_count: 5} +policy "campaign_group:7": {window_sec: 86400, max_count: 50} +policy "advertiser:13": {window_sec: 86400, max_count: 20} +``` + +When the impression tracker writes an exposure for an impression on package 2342, the entry's `fcap_keys` is `["campaign:42", "campaign_group:7", "advertiser:13"]`. When evaluating whether a cap has fired, it scans the log for entries matching each label within that policy's window. + +**Charset constraint.** Each segment matches `[a-zA-Z0-9_-]+` so the `:` delimiter is unambiguous. URL-bearing or otherwise colon-bearing values must be hashed or shortened. + +**Multi-tenant operators** typically adopt a tenant prefix (`buyer-acme:campaign:42`) as a deployment convention to prevent key collisions across advertiser orgs on shared state. This is operator policy, not protocol. + +**Why labels, not hierarchy.** Cap dimensions are heterogeneous across customers — some cap at creative, some at line item, some at advertiser-roll-up. A fixed schema either over-prescribes or under-serves. Labels also make cross-seller caps automatic: any policy whose key is shared across sellers (e.g., `buyer-acme:advertiser:13`) enforces across all of them with no extra mode. Cross-cutting policies are explicit — a campaign that needs both per-campaign and per-advertiser caps declares both keys and gets two policy lookups. + +## Reference data model (valkey-backed, log-based) + +The layout below is what [`adcp-go/targeting`](https://github.com/adcontextprotocol/adcp-go/tree/main/targeting) uses. Any backend (Aerospike, DynamoDB, in-memory, anything) is fine; the data shape is the reference, not a requirement. + +### Exposure log (per identity) + +``` +type: STRING (binary-encoded []ExposureEntry, lazy-pruned to window) +key: user:exposures:{HashToken(uid_type + ":" + user_token)} +value: [ + { impression_id, fcap_keys[], timestamp }, + ... +] +``` + +`HashToken` is a 16-byte SHA-256 prefix, hex-encoded. Binary entry encoding keeps the log compact ([`exposure_binary.go`](https://github.com/adcontextprotocol/adcp-go/blob/main/targeting/exposure_binary.go)) — a 30-day log for a typical user is a few KB. + +Each entry records: + +- `impression_id` — generated at TMPX decode. Same value across all of this impression's identity logs. +- `fcap_keys[]` — the labels this impression counts toward. +- `timestamp` — unix seconds. + +### Fcap policy (per fcap_key) + +``` +type: STRING (JSON-encoded FcapPolicy) +key: fcap_policy:{fcap_key} +value: { window_sec, max_count, active, updated_at } +``` + +Sliding window applied at read by filtering `timestamp >= now - window_sec`. No FIXED/SLIDING toggle. + +### Package configuration (per package) + +``` +type: STRING (JSON-encoded PackageConfig) +key: package:identity:{package_id} +value: { + fcap_keys: ["campaign:42", "advertiser:13"], + active: true, + updated_at: +} +``` + +Maps package → fcap_keys. The impression tracker reads this to figure out which labels to tag a new exposure with. + +## Write path: pixel → log + +On a TMPX-bearing pixel fire, the impression tracker: + +1. Decodes the TMPX (HPKE decrypt + binary parse) → resolved identities + `(seller_agent_url, package_id)` context. +2. Looks up the package's `fcap_keys`. +3. Generates one `impression_id`. +4. For each resolved identity, appends `{impression_id, fcap_keys, timestamp}` to `user:exposures:{hash(identity)}`. Prunes entries older than the longest active window (default 30 days). + +The read-modify-write per identity is not atomic in the reference impl ([`engine.go:478`](https://github.com/adcontextprotocol/adcp-go/blob/main/targeting/engine.go#L478)) — concurrent writes for the same user can lose an exposure. The reference impl explicitly accepts this; under-counting under contention is benign for fcap purposes. Atomic append via Lua or a `Store.Append` extension is a deferred optimization. + +## Evaluating whether this impression exhausted a cap + +After writing the exposure, the impression tracker decides whether any cap just fired. For each `fcap_key` on the exposure, it scans the user's identity logs: + +1. Read `user:exposures:{h}` for every resolved identity. +2. Filter entries to `timestamp >= now - policy.window_sec` and `fcap_key ∈ entry.fcap_keys`. +3. Deduplicate by `impression_id` across all the user's identity logs. +4. Compare the deduped count to `policy.max_count`. + +If the deduped count is `>= max_count`, the cap fired on this impression. The impression tracker then writes a cap-fire entry to the Identity Match cap-state store for every `(user_identity, package_id)` whose package maps to the exhausted `fcap_key`. The expiration is `now + remaining_window`, where `remaining_window` is the window of the oldest deduped exposure still in scope. + +For a cap on an advertiser-level label (`advertiser:13`) that maps to multiple packages on multiple sellers, the impression tracker emits one cap-fire entry per `(user_identity, seller_agent_url, package_id)` affected — main's [boundary contract](/docs/trusted-match/identity-match-implementation#the-cap-fire-event) is package-scoped, so cross-dimensional caps fan out at write time. + +## SDK primitives + +The SDK ships impression handling as two composable functions, not one bundled call. Production tracking endpoints typically decode at intake and let a downstream worker write the store at its own pace; bundling decode+write into a single function would force synchronous topology and prevent buffering. + +``` +decodeTmpx(raw_tmpx) -> ExposureLog + Decrypts HPKE ciphertext, parses the published TMPX binary format + (/docs/trusted-match/specification#binary-format), returns the resolved + identity entries in a structured form ready for serialization onto a + topic or for direct write. + +writeExposure(log, fcap_keys, store_context) -> { ok, fired_caps } + Appends entries to each identity's exposure log with a fresh impression_id + and the supplied fcap_keys. Prunes entries older than the longest active + window. Returns the set of caps that fired on this impression — the + caller fans these out to the Identity Match cap-state store. +``` + +Plus the buyer-side management plane: + +``` +upsertPackage(seller_agent_url, package_id, fcap_keys, opts) +upsertFcapPolicy(fcap_key, {window_sec, max_count}) +inspectExposures(uid_type, user_token, fcap_key?) // debugging helper +``` + +Plus HPKE encrypt/decrypt as net-new SDK primitives (X25519 KEM, ChaCha20-Poly1305, HKDF-SHA256 per RFC 9180 `mode_base`). Encrypt is needed by the Identity Match service emitting TMPX; decrypt by the impression tracker invoking `decodeTmpx`. + +The same surface ships in `@adcp/client` (TS), `adcp-go`, and `adcp` (Python). + +## Production topology pattern + +A typical Scope3-style deployment: + +``` +publisher pixel fires {TMPX} → tracking endpoint + │ + decodeTmpx (synchronous, at intake) + │ + ▼ + pub/sub topic + │ + frequency_writer worker + │ + writeExposure (asynchronous) + │ + ▼ + valkey (exposure log) + │ + if cap fired → RecordCap to + Identity Match cap-state store +``` + +Decode at intake; emit to pub/sub for buffering; downstream worker writes the exposure log and emits any cap-fire events. Buffering, retries, dedup, observability, and abuse protection live at the queue layer — none of that is the SDK's job. A simpler synchronous pipeline (decode + write in the same handler) is also valid for low-volume deployments. + +## Conformance scenarios + +These walk through impression-tracker behavior end-to-end. They are buyer-internal mechanics; the on-wire observable is whatever cap-fire entries land in the Identity Match cap-state store, which surfaces as eligibility decisions in later `identity_match_request` calls. + +Setup for both scenarios: `package = "pkg-42"` on `seller-a.example`, `fcap_keys: ["campaign:42"]`, `policy campaign:42 = {window_sec: 86400, max_count: 5}`. + +### Scenario A — multi-identity dedup + +User has two resolved identities: `rampid:abc` and `id5:def`. + +**Three impressions, each TMPX resolves both identities.** Each impression writes the same `impression_id` to both identity logs: + +``` +user:exposures: = [ + { impression_id: "imp-001", fcap_keys: ["campaign:42"], ts: ... }, + { impression_id: "imp-002", fcap_keys: ["campaign:42"], ts: ... }, + { impression_id: "imp-003", fcap_keys: ["campaign:42"], ts: ... } +] +user:exposures: = [ same three entries ] +``` + +At the third write, the impression tracker checks: union both logs, dedupe by `impression_id` → 3 distinct impressions. Under cap of 5 → no cap-fire entry emitted. + +**Three more impressions, only `rampid:abc` resolves (id5 lookup fails).** Logs after the 6th impression: + +``` +user:exposures: += [ imp-004, imp-005, imp-006 ] +user:exposures: unchanged +``` + +At write of imp-005 (the 5th distinct impression), the deduped count is 5 = `max_count` → the cap just exhausted. The impression tracker emits a cap-fire entry to the Identity Match cap-state store for both identities: + +``` +RecordCap(rampid:abc, [{seller-a.example, pkg-42}], expire_at) +RecordCap(id5:def, [{seller-a.example, pkg-42}], expire_at) +``` + +A counter-based tracker with MAX merge_rule would have counted `max(rampid, id5) = max(6, 3) = 6` only after imp-006, and would have over-served by one impression — or under-counted in the reverse pathological case. The log + `impression_id` dedup gets the count right regardless of identity-resolution stability. + +### Scenario B — cross-seller advertiser cap + +Two packages on different sellers, both mapped to the same advertiser-level label: + +``` +package:identity:pkg-A = { fcap_keys: ["advertiser:13"], active: true } // seller-a +package:identity:pkg-B = { fcap_keys: ["advertiser:13"], active: true } // seller-b +fcap_policy:advertiser:13 = { window_sec: 86400, max_count: 10 } +``` + +Ten impressions on `pkg-A` from `seller-a`. Each exposure entry's `fcap_keys` includes `advertiser:13`. At the 10th write, the deduped count for `advertiser:13` matches `max_count`. The impression tracker emits cap-fire entries for **every package mapped to `advertiser:13` across all sellers**, for every resolved identity: + +``` +RecordCap(, [ + {seller-a.example, pkg-A}, + {seller-b.example, pkg-B}, +], expire_at) +``` + +A subsequent `identity_match_request` from `seller-b` for `pkg-B` returns `eligible_package_ids: []` because the cap-state entry is present. The advertiser-level cap enforces across sellers because the `fcap_key` is shared. No cross-seller coordination is required at the IdentityMatch service — the buyer agent's impression tracker is the single source of truth, and the cap-state store is the publication channel. + +## Performance reference + +Numbers below are from [`targeting/scale_test.go`](https://github.com/adcontextprotocol/adcp-go/blob/main/targeting/scale_test.go) against the in-memory mock store, single goroutine. They isolate CPU from network. They describe the **impression tracker's** evaluation cost — the cost of scanning logs and deciding whether this impression just fired a cap. The Identity Match service's at-query-time cost is a separate, much smaller cap-state presence check. + +**Per-eval at write time, varying log size, single identity, single fcap_key:** + +| Prior exposures in user's log | Eval latency | +|---|---| +| 0 | 368 ns | +| 100 | 5.3 µs | +| 1,000 | 53 µs | +| 10,000 | 118 µs | + +Linear scan with binary lazy dedup; sub-millisecond at 10K entries. + +**Combined load (multi-identity, multi-package eval), varying all dimensions:** + +| packages mapped via fcap_keys | log entries / id | identities | CPU/eval | +|---|---|---|---| +| 100 | 1,000 | 3 | 1.0 ms | +| 1,000 | 1,000 | 3 | 7.5 ms ← realistic Scope3-shape load | +| 1,000 | 10,000 | 3 | 58 ms ← pathological tail (heavy users) | + +CPU scales in `packages × log_entries × identities`. The pathological tail is addressed by the algorithmic optimization in [adcp-go#103](https://github.com/adcontextprotocol/adcp-go/pull/103) (heuristic-gated prefilter bucket; gated at `numPackages > 50` to avoid regressions on small requests): + +| packages | log entries | identities | Before | After | Speedup | +|----------|------------:|-----------:|----------:|---------:|--------:| +| 1,000 | 100 | 3 | 784 µs | 71 µs | 11.0× | +| 1,000 | 1,000 | 3 | 7,566 µs | 287 µs | 26.4× | +| 1,000 | 10,000 | 3 | 57,861 µs | 1,500 µs | ~38× | + +Production sizing also depends on valkey round-trip latency, tail behavior under load, and the heavy-user impression-distribution shape. Mock-store CPU is the floor, not the production number. + +## See also + +- [Frequency-Cap Data Flow](/docs/trusted-match/identity-match-implementation) — the cap-fire boundary contract this page sits behind +- [TMP Specification](/docs/trusted-match/specification) — wire spec, conformance invariants +- [`adcp-go/targeting`](https://github.com/adcontextprotocol/adcp-go/tree/main/targeting) — reference Go implementation of the model on this page +- [`adcp-go/targeting/fcap`](https://github.com/adcontextprotocol/adcp-go/tree/main/targeting/fcap) — reference cap-state store on the other side of the boundary From 39bee13d3f5207b50e0891f5674d77ca8f018742 Mon Sep 17 00:00:00 2001 From: Oleksandr Halushchak Date: Wed, 20 May 2026 13:37:25 +0200 Subject: [PATCH 7/8] docs(tmp): apply review feedback from #3359 aao-release-bot Follow-ups: - index.json: fold the trusted-match impl pointer into the existing `description` field and repoint at the canonical docs (specification.mdx for conformance invariants, identity-match-implementation.mdx for the cap-fire boundary contract, impression-tracker-implementation.mdx for the non-normative impl reference). Drops the net-new `implementation-guidance` sibling shape that only appeared in this one protocol block. - impression-tracker-implementation.mdx Scenario A: restructured so the cap fires at imp-005 with both identities resolved. Reading both logs and dedup'ing by impression_id produces the count of 5 (rampid:abc log = 5, id5:def log = 4); cap-fire entries for both identities are then consistent with rule #3. Earlier wording had imp-005 with only rampid:abc resolved while still emitting a RecordCap for id5:def, contradicting that rule. Nits: - buyer-guide.mdx: repoint the impl-details cross-link at impression-tracker-implementation (where the fcap_keys label model, reference valkey data model, SDK primitives, and conformance scenarios actually live), plus a secondary link to the boundary contract. - impression-tracker-implementation.mdx: rename `decodeTmpx`'s return type to `DecodedExposures` so it no longer overloads the `ExposureLog` type name used for the persistent per-identity log. - impression-tracker-implementation.mdx: lowercase the RFC-2119 MUST NOT in the pixel-retry rule since this page is non-normative (the conformance-citable contract lives on the Frequency-Cap Data Flow page). - impression-tracker-implementation.mdx: add a disclaimer that SDK primitive names are illustrative, with canonical signatures landing via the SDK RFCs. Co-Authored-By: Claude Opus 4.7 (1M context) --- docs/trusted-match/buyer-guide.mdx | 2 +- .../impression-tracker-implementation.mdx | 64 ++++++++++++------- static/schemas/source/index.json | 5 +- 3 files changed, 44 insertions(+), 27 deletions(-) diff --git a/docs/trusted-match/buyer-guide.mdx b/docs/trusted-match/buyer-guide.mdx index b78fdbbd48..1288f3cd6d 100644 --- a/docs/trusted-match/buyer-guide.mdx +++ b/docs/trusted-match/buyer-guide.mdx @@ -156,7 +156,7 @@ When an fcap rule changes — a window shortens or lengthens, a `max_count` rise Because Identity Match runs across all publishers using TMP, a user who saw your ad on Publisher A will correctly show as over-frequency on Publisher B — even though you can't see which publisher sent the request. -For the implementation details — the fcap_keys label model, the reference valkey data model, audience and exposure record shapes, the SDK primitives, and conformance scenarios — see [Identity Match implementation](/docs/trusted-match/identity-match-implementation). +For the implementation details — the fcap_keys label model, the reference valkey data model, exposure record shapes, the SDK primitives, and conformance scenarios — see [Impression Tracker Implementation Reference](/docs/trusted-match/impression-tracker-implementation). The boundary contract that sits between the impression tracker and the Identity Match service is at [Frequency-Cap Data Flow](/docs/trusted-match/identity-match-implementation). ### How Buyers Learn About Exposures diff --git a/docs/trusted-match/impression-tracker-implementation.mdx b/docs/trusted-match/impression-tracker-implementation.mdx index d5c6f7e66b..4250a1c6ba 100644 --- a/docs/trusted-match/impression-tracker-implementation.mdx +++ b/docs/trusted-match/impression-tracker-implementation.mdx @@ -32,7 +32,7 @@ Required properties: 1. **Globally unique across all sellers, sources, and time.** A buyer agent serves impressions sourced from many sellers. Collisions across sellers would silently merge distinct impressions and under-count the cap. Use UUIDv4 (≥122 bits randomness) or an equivalent collision-resistant generator. 2. **Generated by the buyer's impression tracker at TMPX decode** — not by the seller, the publisher, the router, or the TMPX nonce. The TMPX nonce is per-Identity-Match-evaluation and shared across all impressions in the serve window; seller- or publisher-supplied IDs would collide. 3. **One id per impression, written to ALL of the user's resolved identity logs for that impression.** Generating a different id per identity breaks the dedup contract — the same impression would count once per resolved identity. -4. **Pixel retries are a separate concern.** The same pixel firing twice (network retry, page refresh, etc.) MUST NOT mint two `impression_id`s. Either dedupe incoming requests by an idempotency key in the pixel URL or `Idempotency-Key` header, or accept a small over-count from retries as benign for fcap purposes. Cross-identity dedup and per-pixel idempotency are different problems with different mitigations. +4. **Pixel retries are a separate concern.** The same pixel firing twice (network retry, page refresh, etc.) must not mint two `impression_id`s — minting two would let pixel retries double-count against the cap. Either dedupe incoming requests by an idempotency key in the pixel URL or `Idempotency-Key` header, or accept a small over-count from retries as benign for fcap purposes. Cross-identity dedup and per-pixel idempotency are different problems with different mitigations. (Lowercase wording: this page is non-normative; the boundary contract on the [Frequency-Cap Data Flow](/docs/trusted-match/identity-match-implementation) page is what conformance tests cite.) ## fcap_keys label model @@ -129,17 +129,19 @@ For a cap on an advertiser-level label (`advertiser:13`) that maps to multiple p The SDK ships impression handling as two composable functions, not one bundled call. Production tracking endpoints typically decode at intake and let a downstream worker write the store at its own pace; bundling decode+write into a single function would force synchronous topology and prevent buffering. ``` -decodeTmpx(raw_tmpx) -> ExposureLog +decodeTmpx(raw_tmpx) -> DecodedExposures Decrypts HPKE ciphertext, parses the published TMPX binary format (/docs/trusted-match/specification#binary-format), returns the resolved identity entries in a structured form ready for serialization onto a - topic or for direct write. - -writeExposure(log, fcap_keys, store_context) -> { ok, fired_caps } - Appends entries to each identity's exposure log with a fresh impression_id - and the supplied fcap_keys. Prunes entries older than the longest active - window. Returns the set of caps that fired on this impression — the - caller fans these out to the Identity Match cap-state store. + topic or for direct write. The persistent per-identity exposure log + is a separate, store-resident structure — see Reference data model above. + +writeExposure(decoded, fcap_keys, store_context) -> { ok, fired_caps } + Appends entries to each resolved identity's exposure log with a fresh + impression_id and the supplied fcap_keys. Prunes entries older than the + longest active window. Returns the set of caps that fired on this + impression — the caller fans these out to the Identity Match cap-state + store. ``` Plus the buyer-side management plane: @@ -154,6 +156,8 @@ Plus HPKE encrypt/decrypt as net-new SDK primitives (X25519 KEM, ChaCha20-Poly13 The same surface ships in `@adcp/client` (TS), `adcp-go`, and `adcp` (Python). +> **Primitive names are illustrative.** `decodeTmpx`, `writeExposure`, `upsertPackage`, `upsertFcapPolicy`, and `inspectExposures` describe the shape of the SDK surface; canonical signatures land with the corresponding SDK RFCs and may differ in naming or argument order. Treat this section as the impression-tracker decomposition, not as an API contract. + ## Production topology pattern A typical Scope3-style deployment: @@ -187,36 +191,50 @@ Setup for both scenarios: `package = "pkg-42"` on `seller-a.example`, `fcap_keys ### Scenario A — multi-identity dedup -User has two resolved identities: `rampid:abc` and `id5:def`. +User has two resolved identities across the impression stream: `rampid:abc` and `id5:def`. Identity resolution toggles — most impressions resolve both, but one resolves rampid only. -**Three impressions, each TMPX resolves both identities.** Each impression writes the same `impression_id` to both identity logs: +**imp-001, imp-002, imp-003** — TMPX resolves both identities. Each impression writes the same `impression_id` to both logs: ``` -user:exposures: = [ - { impression_id: "imp-001", fcap_keys: ["campaign:42"], ts: ... }, - { impression_id: "imp-002", fcap_keys: ["campaign:42"], ts: ... }, - { impression_id: "imp-003", fcap_keys: ["campaign:42"], ts: ... } -] -user:exposures: = [ same three entries ] +user:exposures: = [ imp-001, imp-002, imp-003 ] +user:exposures: = [ imp-001, imp-002, imp-003 ] ``` -At the third write, the impression tracker checks: union both logs, dedupe by `impression_id` → 3 distinct impressions. Under cap of 5 → no cap-fire entry emitted. +**imp-004** — TMPX resolves rampid only (id5 lookup fails). imp-004 is written to rampid's log only: -**Three more impressions, only `rampid:abc` resolves (id5 lookup fails).** Logs after the 6th impression: +``` +user:exposures: = [ imp-001..imp-004 ] +user:exposures: = [ imp-001..imp-003 ] unchanged +``` + +**imp-005** — TMPX resolves both identities again. imp-005 is written to both logs. The impression tracker then evaluates the cap by reading both resolved-identity logs: ``` -user:exposures: += [ imp-004, imp-005, imp-006 ] -user:exposures: unchanged +rampid:abc log: { imp-001, imp-002, imp-003, imp-004, imp-005 } = 5 entries +id5:def log: { imp-001, imp-002, imp-003, imp-005 } = 4 entries ``` -At write of imp-005 (the 5th distinct impression), the deduped count is 5 = `max_count` → the cap just exhausted. The impression tracker emits a cap-fire entry to the Identity Match cap-state store for both identities: +Union the entries across logs, deduplicate by `impression_id`: + +``` +{ imp-001, imp-002, imp-003, imp-004, imp-005 } = 5 distinct impressions +``` + +5 = `max_count` → the cap just exhausted. Since both identities are resolved on imp-005, the impression tracker emits cap-fire entries for both: ``` RecordCap(rampid:abc, [{seller-a.example, pkg-42}], expire_at) RecordCap(id5:def, [{seller-a.example, pkg-42}], expire_at) ``` -A counter-based tracker with MAX merge_rule would have counted `max(rampid, id5) = max(6, 3) = 6` only after imp-006, and would have over-served by one impression — or under-counted in the reverse pathological case. The log + `impression_id` dedup gets the count right regardless of identity-resolution stability. +Two things are demonstrated: + +- **Dedup matters.** Naively summing per-identity counts gives `5 + 4 = 9` — way over `max_count`. Dedup by `impression_id` recovers the correct count of 5. +- **Identity-resolution stability isn't required.** imp-004 missed id5's log entirely; dedup at evaluation time still produces the right answer when both identities are next resolved together. + +A counter-based tracker with a MAX merge_rule would see counters `max(rampid=5, id5=4) = 5` here — coincidentally correct at this point, but only because the divergence happened to be a single missed write. A second missed-id5 impression (imp-006-style) would push rampid to 6 while leaving id5 at 5; MAX would still say 5 and over-serve by one. SUM over-counts in the opposite direction. The log + `impression_id` dedup is correct by construction. + +A consequence to flag for the implementer: if a future query resolves only id5:def, the cap-state lookup hits the id5:def entry written at imp-005 and the user is correctly suppressed. If neither identity gets resolved in a future query, no cap-state lookup happens at all — that's an identity-resolution problem upstream of fcap, not a fcap correctness problem. ### Scenario B — cross-seller advertiser cap diff --git a/static/schemas/source/index.json b/static/schemas/source/index.json index 6950dd11e6..ecbdbd7b02 100644 --- a/static/schemas/source/index.json +++ b/static/schemas/source/index.json @@ -1543,7 +1543,7 @@ "purpose": "Declares brand identity and agent for a domain, enabling brand discovery and verification" }, "trusted-match": { - "description": "Trusted Match Protocol (TMP) — real-time execution layer for activating pre-negotiated packages across any surface", + "description": "Trusted Match Protocol (TMP) — real-time execution layer for activating pre-negotiated packages across any surface. Conformance invariants are normative in docs/trusted-match/specification.mdx; the cap-fire boundary contract is at docs/trusted-match/identity-match-implementation.mdx; a non-normative impression-tracker implementation reference (multi-identity dedup, fcap_keys labels, log-based data model, SDK primitives) is at docs/trusted-match/impression-tracker-implementation.mdx. Storage backend is an implementation choice; conformant services may use any store that satisfies the invariants.", "supporting-schemas": { "available-package": { "$ref": "/schemas/tmp/available-package.json", @@ -1587,8 +1587,7 @@ "description": "Per-package eligibility — boolean eligible plus optional intent score" } } - }, - "implementation-guidance": "Conformance invariants and a reference (non-normative) valkey-backed buyer-side data model are documented in specs/identitymatch-fcap-architecture.md. Storage backend is an implementation choice; conformant services may use any store that satisfies the invariants." + } }, "brand-protocol": { "description": "Brand protocol for identity retrieval, rights discovery, acquisition, and lifecycle management", From 1efbfb381cb2eab684c43371ba658499f4a8f50f Mon Sep 17 00:00:00 2001 From: Oleksandr Halushchak Date: Fri, 22 May 2026 14:36:23 +0200 Subject: [PATCH 8/8] docs(tmp): align fcap policy shape with production, clarify any-policy-fires semantics MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Addresses two review comments from @BaiyuScope3 on PR #3359 (impression-tracker owner): r3274986418 — FcapPolicy shape didn't match production: - Replace `{window_sec, max_count}` with the actual production shape `{window: {interval, unit}, max_impression_count}` matching the advertiser-facing config (unit ∈ minutes/hours/days/weeks/months). - Add a "window unit is load-bearing" callout: unit drives the sliding-window bucket size, which affects post-cap re-evaluation cadence. `{interval: 2, unit: hours}` and `{interval: 120, unit: minutes}` have the same window length but different re-evaluation cadence (next-hour vs next-minute bucket boundary). - Update read-time filter description from "timestamp >= now - window_sec" to bucket-based filter that reflects how production actually counts. - Update cap-fire expiration from "now + remaining_window" to "end of the current bucket of policy.window". r3276468521 — multi-policy fire semantics weren't explicit: - Lead the eval section with the fact that a package typically maps to multiple fcap_keys, each with its own policy, and the cap fires as soon as ANY policy reaches its max_impression_count. Also updates Scenarios A and B + the SDK `upsertFcapPolicy` signature so naming is consistent across the doc. The MAX-merge-rule counter-example callout now also shows the SUM number (9) for symmetry. Co-Authored-By: Claude Opus 4.7 (1M context) --- .../impression-tracker-implementation.mdx | 38 ++++++++++--------- 1 file changed, 21 insertions(+), 17 deletions(-) diff --git a/docs/trusted-match/impression-tracker-implementation.mdx b/docs/trusted-match/impression-tracker-implementation.mdx index 4250a1c6ba..a029d9613b 100644 --- a/docs/trusted-match/impression-tracker-implementation.mdx +++ b/docs/trusted-match/impression-tracker-implementation.mdx @@ -36,17 +36,19 @@ Required properties: ## fcap_keys label model -Caps are tagged with `dimension:value` labels at impression-write time. Packages declare which labels they map to; fcap policies attach `(window_sec, max_count)` to each label. +Caps are tagged with `dimension:value` labels at impression-write time. Packages declare which labels they map to; fcap policies attach a `window` and a `max_impression_count` to each label. ``` package 2342: fcap_keys ["campaign:42", "campaign_group:7", "advertiser:13"] -policy "campaign:42": {window_sec: 60, max_count: 5} -policy "campaign_group:7": {window_sec: 86400, max_count: 50} -policy "advertiser:13": {window_sec: 86400, max_count: 20} +policy "campaign:42": {window: {interval: 10, unit: "minutes"}, max_impression_count: 5} +policy "campaign_group:7": {window: {interval: 1, unit: "days"}, max_impression_count: 50} +policy "advertiser:13": {window: {interval: 1, unit: "days"}, max_impression_count: 20} ``` When the impression tracker writes an exposure for an impression on package 2342, the entry's `fcap_keys` is `["campaign:42", "campaign_group:7", "advertiser:13"]`. When evaluating whether a cap has fired, it scans the log for entries matching each label within that policy's window. +**Window unit is load-bearing**, not just human-readable shorthand. The reference impl uses `unit` as the sliding-window bucket size: `unit: "hours"` evaluates against hourly buckets; `unit: "minutes"` evaluates against minute buckets. Two policies that look duration-equivalent — `{interval: 2, unit: "hours"}` vs `{interval: 120, unit: "minutes"}` — have the **same window length** but **different post-cap re-evaluation cadence**. After a user hits the 2-hour-bucket cap, the next eligibility check that admits new traffic happens at the next-hour bucket boundary; for the 120-minute-bucket policy, it happens at the next-minute bucket boundary. Pick `unit` to match the cadence you want, not the duration you can fit in the smaller number. + **Charset constraint.** Each segment matches `[a-zA-Z0-9_-]+` so the `:` delimiter is unambiguous. URL-bearing or otherwise colon-bearing values must be hashed or shortened. **Multi-tenant operators** typically adopt a tenant prefix (`buyer-acme:campaign:42`) as a deployment convention to prevent key collisions across advertiser orgs on shared state. This is operator policy, not protocol. @@ -81,10 +83,10 @@ Each entry records: ``` type: STRING (JSON-encoded FcapPolicy) key: fcap_policy:{fcap_key} -value: { window_sec, max_count, active, updated_at } +value: { window: {interval, unit}, max_impression_count, active, updated_at } ``` -Sliding window applied at read by filtering `timestamp >= now - window_sec`. No FIXED/SLIDING toggle. +Sliding window applied at read by counting log entries that fall in the current and prior buckets that span the window. Bucket size is derived from `window.unit` (`minutes`/`hours`/`days`/`weeks`/`months`); window length is `interval × unit`. The bucket-level filter, not a per-second `>=` filter on entry timestamps, is what production uses — it makes re-evaluation cadence after a cap fires predictable from the policy's `unit`. ### Package configuration (per package) @@ -113,14 +115,16 @@ The read-modify-write per identity is not atomic in the reference impl ([`engine ## Evaluating whether this impression exhausted a cap -After writing the exposure, the impression tracker decides whether any cap just fired. For each `fcap_key` on the exposure, it scans the user's identity logs: +After writing the exposure, the impression tracker decides whether any cap just fired. **A package typically maps to multiple `fcap_keys` (campaign, campaign_group, advertiser, …), each with its own policy. Policies are evaluated independently, and the cap fires when *any one* of them reaches `max_impression_count` within its window.** A user can be capped on a package by the per-campaign policy without ever approaching the per-advertiser policy, or vice versa. + +For each `fcap_key` on the exposure, the impression tracker scans the user's identity logs: 1. Read `user:exposures:{h}` for every resolved identity. -2. Filter entries to `timestamp >= now - policy.window_sec` and `fcap_key ∈ entry.fcap_keys`. +2. Filter entries to those that fall in the current+prior buckets spanning `policy.window` and where `fcap_key ∈ entry.fcap_keys`. 3. Deduplicate by `impression_id` across all the user's identity logs. -4. Compare the deduped count to `policy.max_count`. +4. Compare the deduped count to `policy.max_impression_count`. -If the deduped count is `>= max_count`, the cap fired on this impression. The impression tracker then writes a cap-fire entry to the Identity Match cap-state store for every `(user_identity, package_id)` whose package maps to the exhausted `fcap_key`. The expiration is `now + remaining_window`, where `remaining_window` is the window of the oldest deduped exposure still in scope. +If any policy's deduped count is `>= max_impression_count`, the cap fired on this impression. The impression tracker then writes a cap-fire entry to the Identity Match cap-state store for every `(user_identity, package_id)` whose package maps to the exhausted `fcap_key`. The expiration is the end of the current bucket of `policy.window` (which is when the oldest in-scope exposure ages out under bucket semantics). For a cap on an advertiser-level label (`advertiser:13`) that maps to multiple packages on multiple sellers, the impression tracker emits one cap-fire entry per `(user_identity, seller_agent_url, package_id)` affected — main's [boundary contract](/docs/trusted-match/identity-match-implementation#the-cap-fire-event) is package-scoped, so cross-dimensional caps fan out at write time. @@ -148,7 +152,7 @@ Plus the buyer-side management plane: ``` upsertPackage(seller_agent_url, package_id, fcap_keys, opts) -upsertFcapPolicy(fcap_key, {window_sec, max_count}) +upsertFcapPolicy(fcap_key, {window: {interval, unit}, max_impression_count}) inspectExposures(uid_type, user_token, fcap_key?) // debugging helper ``` @@ -187,7 +191,7 @@ Decode at intake; emit to pub/sub for buffering; downstream worker writes the ex These walk through impression-tracker behavior end-to-end. They are buyer-internal mechanics; the on-wire observable is whatever cap-fire entries land in the Identity Match cap-state store, which surfaces as eligibility decisions in later `identity_match_request` calls. -Setup for both scenarios: `package = "pkg-42"` on `seller-a.example`, `fcap_keys: ["campaign:42"]`, `policy campaign:42 = {window_sec: 86400, max_count: 5}`. +Setup for both scenarios: `package = "pkg-42"` on `seller-a.example`, `fcap_keys: ["campaign:42"]`, `policy campaign:42 = {window: {interval: 1, unit: "days"}, max_impression_count: 5}`. ### Scenario A — multi-identity dedup @@ -220,7 +224,7 @@ Union the entries across logs, deduplicate by `impression_id`: { imp-001, imp-002, imp-003, imp-004, imp-005 } = 5 distinct impressions ``` -5 = `max_count` → the cap just exhausted. Since both identities are resolved on imp-005, the impression tracker emits cap-fire entries for both: +5 = `max_impression_count` → the cap just exhausted. Since both identities are resolved on imp-005, the impression tracker emits cap-fire entries for both: ``` RecordCap(rampid:abc, [{seller-a.example, pkg-42}], expire_at) @@ -229,10 +233,10 @@ RecordCap(id5:def, [{seller-a.example, pkg-42}], expire_at) Two things are demonstrated: -- **Dedup matters.** Naively summing per-identity counts gives `5 + 4 = 9` — way over `max_count`. Dedup by `impression_id` recovers the correct count of 5. +- **Dedup matters.** Naively summing per-identity counts gives `5 + 4 = 9` — way over `max_impression_count`. Dedup by `impression_id` recovers the correct count of 5. - **Identity-resolution stability isn't required.** imp-004 missed id5's log entirely; dedup at evaluation time still produces the right answer when both identities are next resolved together. -A counter-based tracker with a MAX merge_rule would see counters `max(rampid=5, id5=4) = 5` here — coincidentally correct at this point, but only because the divergence happened to be a single missed write. A second missed-id5 impression (imp-006-style) would push rampid to 6 while leaving id5 at 5; MAX would still say 5 and over-serve by one. SUM over-counts in the opposite direction. The log + `impression_id` dedup is correct by construction. +A counter-based tracker with a MAX merge_rule would see counters `max(rampid=5, id5=4) = 5` here — coincidentally correct at this point, but only because the divergence happened to be a single missed write. A second missed-id5 impression (imp-006-style) would push rampid to 6 while leaving id5 at 5; MAX would still say 5 and over-serve by one. SUM (= 9 here) over-counts in the opposite direction. The log + `impression_id` dedup is correct by construction. A consequence to flag for the implementer: if a future query resolves only id5:def, the cap-state lookup hits the id5:def entry written at imp-005 and the user is correctly suppressed. If neither identity gets resolved in a future query, no cap-state lookup happens at all — that's an identity-resolution problem upstream of fcap, not a fcap correctness problem. @@ -243,10 +247,10 @@ Two packages on different sellers, both mapped to the same advertiser-level labe ``` package:identity:pkg-A = { fcap_keys: ["advertiser:13"], active: true } // seller-a package:identity:pkg-B = { fcap_keys: ["advertiser:13"], active: true } // seller-b -fcap_policy:advertiser:13 = { window_sec: 86400, max_count: 10 } +fcap_policy:advertiser:13 = { window: {interval: 1, unit: "days"}, max_impression_count: 10 } ``` -Ten impressions on `pkg-A` from `seller-a`. Each exposure entry's `fcap_keys` includes `advertiser:13`. At the 10th write, the deduped count for `advertiser:13` matches `max_count`. The impression tracker emits cap-fire entries for **every package mapped to `advertiser:13` across all sellers**, for every resolved identity: +Ten impressions on `pkg-A` from `seller-a`. Each exposure entry's `fcap_keys` includes `advertiser:13`. At the 10th write, the deduped count for `advertiser:13` matches `max_impression_count`. The impression tracker emits cap-fire entries for **every package mapped to `advertiser:13` across all sellers**, for every resolved identity: ``` RecordCap(, [