feat(comptroller): real Workers-AI (T0) cost insights at GET /api/v1/insights by chitcommit · Pull Request #83 · chittyos/chittyops

chitcommit · 2026-06-11T01:27:43Z

What

Adds a real AI-categorization + deeper-insight endpoint to the live ChittyComptroller worker, using Cloudflare Workers-AI (T0) — SPEC-compliant (@cf/* models are T0; this NEVER uses an LLM above T0). Replaces reliance on static COA mapping with genuine AI insight grounded in real chittyops.cost_ledger data.

Endpoint

GET /api/v1/insights (?refresh=1 to bypass cache)
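
The PR describes the cache as keyed by `insights:{chicago-date}`, held for roughly six hours, and bypassed with `?refresh=1`. A minimal sketch of the key derivation and the bypass check could look like the following. The helper names `insightsCacheKey` and `shouldBypassCache`, the `YYYY-MM-DD` date format, and the TTL constant are assumptions made for illustration and are not taken from worker.ts.

```typescript
// Hypothetical helpers: names, the YYYY-MM-DD format, and the TTL constant are assumptions.
const INSIGHTS_TTL_SECONDS = 6 * 60 * 60; // "~6h" per the PR description

/** Build the per-day cache key from the calendar date in America/Chicago. */
function insightsCacheKey(now: Date): string {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone: "America/Chicago",
    year: "numeric",
    month: "2-digit",
    day: "2-digit",
  }).formatToParts(now);
  // Assemble the date from named parts so the result does not depend on locale ordering.
  const get = (type: string): string =>
    parts.find((p) => p.type === type)?.value ?? "";
  return `insights:${get("year")}-${get("month")}-${get("day")}`;
}

/** True when the caller passed ?refresh=1 and the cached value should be skipped. */
function shouldBypassCache(requestUrl: string): boolean {
  return new URL(requestUrl).searchParams.get("refresh") === "1";
}
```

In a Worker, a TTL like this would typically be passed as `expirationTtl` when writing the cached response to KV.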

Design — SQL owns every number, the model owns only prose

queryInsightsAggregates() computes all figures in SQL: today + all-time spend, per-service+tier+provider, 7-day daily trend, top models by cost and by call volume, and the workers-ai vs external-provider split.
runInsightsModel() feeds those finished figures to @cf/meta/llama-3.1-8b-instruct once (not per-row) and asks for narrative-only fields. The prompt forbids inventing/restating costs or editorializing magnitude.
Numeric fields in the response come straight from the queries; only prose comes from the model → "grounded, no fabrication" is structural, not prompt-dependent.
Parse failures surface raw model text (narrative_error + model_raw) — no fabricated fallback.
Cached ~6h in KV_STATE (insights:{chicago-date}); never runs on the 5-min poll (avoids meta-cost).
Empty-state: zero rows returns a clear empty result and skips the model entirely.
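
To make the "SQL owns every number, the model owns only prose" split concrete, here is a minimal sketch of how a response could be assembled. It is not the code in worker.ts: the `Totals` field names, `parseNarrative`, and `buildInsights` are hypothetical, and only three of the prose fields are shown. The point it illustrates is that numeric fields are copied from the query result and anything numeric the model emits is never read.

```typescript
// Hypothetical shapes: field names are illustrative, not taken from worker.ts.
interface Totals {
  today_cost_usd: number;
  today_calls: number;
}

interface NarrativeFields {
  drivers: string[];
  trends: string[];
  recommendations: string[];
}

type Narrative = NarrativeFields | { narrative_error: string; model_raw: string };

/** Keep only string arrays from the model output; surface raw text on parse failure. */
function parseNarrative(modelText: string): Narrative {
  try {
    const parsed: unknown = JSON.parse(modelText);
    if (typeof parsed !== "object" || parsed === null) {
      throw new Error("model output is not a JSON object");
    }
    const obj = parsed as Record<string, unknown>;
    const strings = (v: unknown): string[] =>
      Array.isArray(v) ? v.filter((x): x is string => typeof x === "string") : [];
    return {
      drivers: strings(obj["drivers"]),
      trends: strings(obj["trends"]),
      recommendations: strings(obj["recommendations"]),
    };
  } catch (e) {
    // No fabricated fallback: report the error and the untouched model text.
    return {
      narrative_error: e instanceof Error ? e.message : String(e),
      model_raw: modelText,
    };
  }
}

/** Numbers come from SQL; `totals` is written last so model output cannot override it. */
function buildInsights(totals: Totals, modelText: string) {
  return { ...parseNarrative(modelText), totals };
}
```

Because `totals` is spread last and `parseNarrative` only ever reads prose keys, a model that restates or invents a figure has no path into the numeric part of the response.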

Response shape

{generated_at, window, totals, per_service[], drivers[], trends[], recommendations[], daily_trend[], top_models_by_cost[], top_models_by_calls[], model_used}

wrangler.toml

Adds [ai] binding = "AI" — free, on-account, no new secret.

Live verification

Deployed (version f10db6b3-1aa9-4f06-9167-559a2615bb2b) and curled live at comptroller.chitty.cc/api/v1/insights. Figures match a direct Neon query exactly:

Figure	Endpoint	Neon
today cost	0.077455	0.077455
today calls	2055	2055
all-time cost	0.238430	0.238430
all-time rows	4471	4471
qwen3-embedding 7d cost	0.069217	0.069217

The model correctly characterized chittycounsel as an embedding-heavy workload (qwen3-embedding, high tokens_in / ~0 tokens_out) and flagged the real 6/8 → 6/10 cost ramp as the notable trend.

🤖 Generated with Claude Code

…t ledger

Replace stubbed worker internals with real implementations:
- getDb/getWriteDb helpers over porsager `postgres` driver on Hyperdrive connectionString (the old `env.NEON_COMPTROLLER.query()` was fictional).
- pullCFAIGatewayAnalytics: real CF AI Gateway /logs ingest for the 4 active gateways with KV high-water dedup, bounded pagination, batch INSERT into chittyops.cost_ledger; tolerant per-gateway failure.
- tierFromModel maps to CHECK-constraint-valid tiers (T0/T3_opus/T3_sonnet/T2_haiku/manual), validated on a Neon temp branch (caught a tier-CHECK bug).
- detectAnomalies/isServiceExempt/budgetStatus/refreshCostLedgerView refactored to the real driver; matview refresh fail-soft on privilege.
- /api/v1/metrics, fetchDailyReport, listAnomalies, checkHardCaps: real queries.
- storeAnomalies/listAnomalies hit chittyops.anomalies (fixed stale comment).
- signHmac: real HMAC-SHA256, fail-closed when key absent.
- Notion + Quo emitters: real not-configured guards (Phase B), no fake content.
- Cold-start + 14d baseline-learning KV state set on first run (safe-state).

WRITER-CONNECTION BLOCKER (Phase A): the Hyperdrive binding is read-only (comptroller_reader). cost_ledger/anomalies writes require a SEPARATE RW Hyperdrive binding (NEON_COMPTROLLER_WRITER). Until provisioned, getWriteDb() fails closed and ingest is skipped (logged), so the poll never errors.

Read path validated live: /api/v1/metrics returns real total_count=0 matching Neon. Both INSERT column lists schema-validated on a disposable Neon branch.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…B clients

NEON_COMPTROLLER_WRITER Hyperdrive (4427ea04, comptroller_writer role, append-only INSERT on cost_ledger+anomalies).

Refactor DB access to per-invocation postgres clients via AsyncLocalStorage scope, ending them with ctx.waitUntil to avoid stale Hyperdrive clients across cron isolate reuse.

Verified live: cost_ledger 0 -> 1900+ rows across chittygateway + chittycounsel, real cost/token mapping (chittycounsel $0.063 captured).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…insights

Adds AI categorization + deeper insight grounded in real cost_ledger data, fulfilling "mini opportunities for ai categorization/deeper insight" with genuine AI rather than static COA mapping. SPEC-compliant: Workers-AI @cf/* models are T0, so this NEVER uses an LLM above T0.

Design (SQL owns every number, the model owns only prose):
- queryInsightsAggregates() computes all figures in SQL (today/all-time spend, per-service+tier+provider, 7-day daily trend, top models by cost and by call volume, workers-ai vs external-provider split).
- runInsightsModel() feeds those finished figures to @cf/meta/llama-3.1-8b-instruct ONCE (not per-row) and asks for narrative-only fields: per-service category + characterization, cost drivers, trend/anomaly notes, 2-4 grounded recs. The prompt forbids inventing/restating costs and editorializing magnitude.
- Numeric fields in the response come straight from the queries; only the prose comes from the model, so "grounded, no fabrication" is structural, and the figure cross-check trivially holds.
- response: {generated_at, window, totals, per_service[], drivers[], trends[], recommendations[], daily_trend[], top_models_by_cost[], top_models_by_calls[], model_used}. Parse failures surface raw model text (no fabricated fallback).
- Cached ~6h in KV_STATE (insights:{chicago-date}); ?refresh=1 bypasses. Never runs on the 5-min poll (avoids meta-cost).
- Empty-state: zero rows in window returns a clear empty result, skips the model.

wrangler.toml: adds [ai] binding = "AI" (free, on-account, no new secret).

Verified live at comptroller.chitty.cc/api/v1/insights: figures match a direct Neon query exactly (today $0.077455/2055 calls, all-time $0.238430/4471 rows, qwen3-embedding $0.069217). Model correctly characterized chittycounsel as an embedding-heavy workload and flagged the real 6/8→6/10 cost ramp.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

github-actions · 2026-06-11T01:27:51Z

@coderabbitai review
@copilot review
Adversarial review request: evaluate security, policy bypass paths, and regression risk.

coderabbitai · 2026-06-11T01:27:52Z

Warning

Review limit reached

@chitcommit, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 6 minutes and 50 seconds. Learn how PR review limits work.

Your organization has run out of usage credits. Purchase more in the billing tab.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: add8a851-c4c0-426e-bf90-b962127c1ad6

📥 Commits

Reviewing files that changed from the base of the PR and between 7163222 and 7e3011d.

⛔ Files ignored due to path filters (1)

services/comptroller/pnpm-lock.yaml is excluded by !**/pnpm-lock.yaml

📒 Files selected for processing (5)

services/comptroller/node-async-hooks.d.ts
services/comptroller/package.json
services/comptroller/tsconfig.json
services/comptroller/worker.ts
services/comptroller/wrangler.toml

Copilot

Pull request overview

Adds a new /api/v1/insights endpoint to the Comptroller Worker that computes cost/usage aggregates in SQL and uses Workers AI (T0) to generate narrative-only insight, while also introducing a new Hyperdrive write path and AI Gateway log ingestion into chittyops.cost_ledger.

Changes:

Adds Workers AI binding + /api/v1/insights endpoint with KV caching and narrative-only model output.
Refactors Neon access to use per-invocation postgres clients (AsyncLocalStorage + Hyperdrive connection strings), plus adds optional writer binding and ingestion/insert paths.
Introduces a standalone TypeScript package setup for the service (tsconfig, package.json, pnpm lock).

Reviewed changes

Copilot reviewed 4 out of 6 changed files in this pull request and generated 6 comments.

File	Description
services/comptroller/wrangler.toml	Enables `nodejs_compat`, adds AI binding + Hyperdrive writer binding, adjusts routes/triggers.
services/comptroller/worker.ts	Adds per-invocation DB scoping, AI Gateway ingestion + inserts, and `/api/v1/insights` implementation.
services/comptroller/tsconfig.json	Adds strict TS config for the Worker package.
services/comptroller/package.json	Adds service-local dependencies/scripts (wrangler/tsc/postgres).
services/comptroller/pnpm-lock.yaml	Locks service-local dependency graph.
services/comptroller/node-async-hooks.d.ts	Adds minimal ambient typing for `AsyncLocalStorage` under `nodejs_compat`.

Files not reviewed (1)

services/comptroller/pnpm-lock.yaml: Language not supported

 routes = [
-  { pattern = "comptroller.chitty.cc/*", custom_domain = true }
+  { pattern = "comptroller.chitty.cc", custom_domain = true }
 ]


+      const rows = fresh.map((l) => ({
+        service: gw,
+        tier: tierFromModel(l.model),
+        provider: l.provider ?? "unknown",
+        model: l.model ?? "unknown",
+        tokens_in: Math.round(l.tokens_in ?? 0),
+        tokens_out: Math.round(l.tokens_out ?? 0),
+        cached_tokens_in: Math.round(l.usage_metadata?.input_cached_tokens ?? 0),
+        cost_usd: Number(l.cost ?? 0),
+        latency_ms: Math.round(l.timings?.latency ?? 0),
+        item_id_hash: l.id,
+        run_id: null as string | null,
+        fallback_chain: null as string[] | null,
+        ts: l.created_at,
+        cost_constrained: false,
+      }));


+  const perService = (await db`
+    SELECT service,
+           coalesce(sum(cost_usd),0)::float8 AS cost_usd,
+           count(*)::int AS calls,
+           coalesce(sum(tokens_in),0)::bigint AS tokens_in,
+           coalesce(sum(tokens_out),0)::bigint AS tokens_out,
+           (array_agg(provider ORDER BY cost_usd DESC NULLS LAST))[1] AS top_provider,
+           (array_agg(tier ORDER BY cost_usd DESC NULLS LAST))[1] AS top_tier
+    FROM chittyops.cost_ledger
+    WHERE ts >= date_trunc('day', now() AT TIME ZONE 'America/Chicago')
+    GROUP BY service
+    ORDER BY cost_usd DESC
+  `) as any[];


+  const modelsByCost = (await db`
+    SELECT model, (array_agg(provider))[1] AS provider,
+           coalesce(sum(cost_usd),0)::float8 AS cost_usd, count(*)::int AS calls
+    FROM chittyops.cost_ledger
+    WHERE ts >= date_trunc('day', now() AT TIME ZONE 'America/Chicago') - interval '6 days'
+    GROUP BY model
+    ORDER BY cost_usd DESC
+    LIMIT 5
+  `) as any[];


+  const modelsByCalls = (await db`
+    SELECT model, (array_agg(provider))[1] AS provider,
+           coalesce(sum(cost_usd),0)::float8 AS cost_usd, count(*)::int AS calls
+    FROM chittyops.cost_ledger
+    WHERE ts >= date_trunc('day', now() AT TIME ZONE 'America/Chicago') - interval '6 days'
+    GROUP BY model
+    ORDER BY calls DESC
+    LIMIT 5
+  `) as any[];


+ * Data layer:
+ *   - READ  : env.NEON_COMPTROLLER (Hyperdrive, comptroller_reader, read-only) → getDb(env)
+ *   - WRITE : env.NEON_COMPTROLLER_WRITER (Hyperdrive over RW role)            → getWriteDb(env)
+ *             Both Hyperdrive bindings expose a `.connectionString`; we drive them with
+ *             porsager `postgres` (works on Workers over Hyperdrive's TCP socket).
+ *             getWriteDb() FAILS CLOSED if the writer binding is absent (Phase-A blocker).
 */


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7e3011d873

chatgpt-codex-connector · 2026-06-11T01:33:33Z

+  if (maxSeen > hwmMs) {
+    await env.KV_STATE.put(hwmKey, new Date(maxSeen).toISOString());


Do not advance the high-water mark past unprocessed pages

When a gateway has more than MAX_PAGES_PER_GATEWAY * LOGS_PER_PAGE new log rows, this loop stops after the page cap but still stores maxSeen from the newest fetched page. On the next poll, fresh only keeps rows with created_at > hwmMs, so every older log that was beyond the page cap is skipped permanently, undercounting cost_ledger after initial deploys, downtime, or high-volume bursts. Only advance the high-water mark once the pagination has reached the old high-water mark/end, or keep a cursor that does not discard the unprocessed tail.
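
The rule in this comment (advance the mark only once pagination has reached the old mark or the end of the log) can be sketched as a pure decision function. This is an illustration under stated assumptions, not the worker's code: `nextHighWaterMark` and the `GatewayLog` shape are hypothetical, and pages are assumed to arrive newest-first.

```typescript
// Hypothetical shape: only the fields this decision needs.
interface GatewayLog {
  id: string;
  created_at: string; // ISO timestamp
}

/**
 * Decide the next high-water mark from pages fetched newest-first.
 * The old mark is returned unchanged unless pagination provably reached it
 * (or ran out of logs), so a page cap can never skip an unread tail.
 */
function nextHighWaterMark(
  pages: GatewayLog[][],
  hwmMs: number,
  logsPerPage: number,
): number {
  let maxSeen = hwmMs;
  let caughtUp = pages.length === 0; // nothing fetched: nothing to skip
  for (const page of pages) {
    for (const log of page) {
      const ts = Date.parse(log.created_at);
      if (ts > hwmMs) maxSeen = Math.max(maxSeen, ts);
      else caughtUp = true; // saw a row at or before the old mark
    }
    if (page.length < logsPerPage) caughtUp = true; // short page: no more logs
    if (caughtUp) break;
  }
  return caughtUp ? maxSeen : hwmMs;
}
```

When the page cap is hit before catching up, the mark stays put, so the rows fetched in that run would either need to be discarded or inserted idempotently; a persistent cursor is the other option the comment mentions and is outside this sketch.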

chatgpt-codex-connector · 2026-06-11T01:33:34Z

+  const enc = new TextEncoder();
+  const a = enc.encode(presented);
+  const b = enc.encode(key);
+  if (a.byteLength !== b.byteLength) return false;


Avoid returning before constant-time comparison

For _admin/poll, any request with a bearer token length different from COMPTROLLER_HMAC_KEY returns before timingSafeEqual, so an attacker who can repeatedly time requests can learn the secret length even though this helper is documented as constant-time. Cloudflare’s own timingSafeEqual guidance warns not to return early on length mismatch; compare a same-length dummy/self buffer and then fail instead.
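
One way to avoid the early return is to fold the length check into the same accumulator as the byte comparison, so every request does work proportional to the presented token rather than to the secret. The sketch below is a portable stand-in, not the helper in worker.ts: `tokenMatches` is a hypothetical name, JavaScript runtimes give no hard constant-time guarantee, and in a Worker the byte comparison itself would more likely be delegated to the runtime's `timingSafeEqual`.

```typescript
/**
 * Compare a presented bearer token with the configured key without
 * branching on the secret's length. Hypothetical helper for illustration.
 */
function tokenMatches(presented: string, key: string): boolean {
  if (key.length === 0) return false; // fail closed when no key is configured
  const enc = new TextEncoder();
  const a = enc.encode(presented);
  const b = enc.encode(key);
  // Start with the length difference so a mismatch still fails,
  // but only after the full loop has run.
  let diff = a.byteLength ^ b.byteLength;
  // Loop length depends only on the attacker-supplied input, not on the key.
  for (let i = 0; i < a.byteLength; i++) {
    diff |= a[i] ^ b[i % b.byteLength];
  }
  return diff === 0;
}
```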

chatgpt-codex-connector · 2026-06-11T01:33:34Z

+  const baseline = await env.KV_STATE.get(BASELINE_LEARNING_KEY);
+  if (!baseline) {
+    const until = new Date(Date.now() + BASELINE_LEARNING_DAYS * 24 * 3600 * 1000).toISOString();
+    await env.KV_STATE.put(BASELINE_LEARNING_KEY, until);


Preserve manual baseline-learning end

When an operator uses /_admin/baseline_learning/end, it deletes this KV key, but the next scheduled run, status request, or admin poll calls ensureColdStartState() and treats the missing key as a fresh deploy by writing a new 14-day window. Since pollMetrics() suppresses L2/L3 while isBaselineLearningActive() is true, the documented manual override is immediately undone and Comptroller remains L1-only; store an initialized/disabled marker or set an expired timestamp instead of recreating after deletion.
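
The suggested fix (store a marker instead of deleting the key) can be sketched as a pure decision over the stored KV value. `decideBaseline` and the `"ended"` marker value are hypothetical names for illustration; the 14-day window comes from the PR description.

```typescript
const BASELINE_LEARNING_DAYS = 14;
const ENDED_MARKER = "ended"; // hypothetical value written by the manual end endpoint

interface BaselineDecision {
  active: boolean;      // whether L2/L3 should still be suppressed
  write: string | null; // value to store in KV, or null to leave it alone
}

/** Decide baseline-learning state from the stored KV value (null = never set). */
function decideBaseline(stored: string | null, nowMs: number): BaselineDecision {
  if (stored === null) {
    // Genuinely fresh deploy: open a new learning window.
    const until = new Date(
      nowMs + BASELINE_LEARNING_DAYS * 24 * 3600 * 1000,
    ).toISOString();
    return { active: true, write: until };
  }
  if (stored === ENDED_MARKER) {
    // Operator ended learning manually: never recreate the window.
    return { active: false, write: null };
  }
  // Otherwise the value is the ISO timestamp at which learning ends.
  return { active: Date.parse(stored) > nowMs, write: null };
}
```

With this shape, the manual end path writes the marker rather than deleting the key, so a missing key can only mean a first run.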

chatgpt-codex-connector · 2026-06-11T01:33:34Z

+  const anomalyCount = (await db`
+    SELECT count(*)::int AS n FROM chittyops.anomalies
+    WHERE detected_at >= date_trunc('day', now() AT TIME ZONE 'America/Chicago')
+  `) as any[];


Add storage for the new anomalies query

I searched repo-wide for CREATE TABLE definitions for anomalies and only found references; the migrations create cost_ledger and pause_exemptions, but not chittyops.anomalies. On a fresh deploy following the checked-in migrations/runbook, /reports/daily now executes this unguarded query and returns 500, which also breaks the AGENTS deploy verification step for /reports/daily; add the table migration or fail-soft here like listAnomalies() does.

chatgpt-codex-connector · 2026-06-11T01:33:34Z

+      await writeDb`
+        INSERT INTO chittyops.cost_ledger ${writeDb(


Make gateway log inserts idempotent

If the 5-minute cron overlaps with /_admin/poll or a previous long-running poll, both invocations can read the same KV high-water mark before either stores the new value and then execute this insert for the same Cloudflare log IDs. The cost ledger migration has no uniqueness constraint on item_id_hash, so those races double-count calls and cost; use a per-gateway lock/cursor or a unique key with ON CONFLICT DO NOTHING before enabling concurrent manual and scheduled ingest.

chatgpt-codex-connector · 2026-06-11T01:33:34Z

+    console.error("[isServiceExempt] query failed:", e);
+    return false; // fail-safe: do not block on lookup failure for L2/L3 gate caller


Fail closed when exemption lookup fails

If the exemption query fails, this returns false, and emitL3Signal() treats a protected service as non-exempt and posts to its pause endpoint without SMS confirmation. This is likely in the checked-in setup because I found no migration granting comptroller_reader access to chittyops.pause_exemptions; the previous uncaught error would have aborted the pause path, while this catch turns permission/table outages into fail-open behavior against the AGENTS hard rule to never pause exempt services without explicit SMS confirm.
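
A fail-closed version of this gate keeps "lookup failed" distinct from "not exempt", so an outage can never be read as permission to pause. The sketch below assumes a hypothetical `ExemptionLookup` result type and `decidePause` function; neither is taken from worker.ts.

```typescript
// Hypothetical result type: the lookup either succeeded with an answer or failed.
type ExemptionLookup =
  | { ok: true; exempt: boolean }
  | { ok: false; error: unknown };

type PauseDecision = "pause" | "require_sms_confirm";

/** Only a successful lookup that says "not exempt" may pause without confirmation. */
function decidePause(lookup: ExemptionLookup): PauseDecision {
  if (!lookup.ok) {
    // Permission or table outage: treat the service as possibly exempt.
    return "require_sms_confirm";
  }
  return lookup.exempt ? "require_sms_confirm" : "pause";
}
```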

chatgpt-codex-connector · 2026-06-11T01:33:34Z

+        tokens_out: Math.round(l.tokens_out ?? 0),
+        cached_tokens_in: Math.round(l.usage_metadata?.input_cached_tokens ?? 0),
+        cost_usd: Number(l.cost ?? 0),
+        latency_ms: Math.round(l.timings?.latency ?? 0),


Read latency from the gateway log duration

Cloudflare's List Gateway Logs response documents duration on each LogListResponse and does not include a timings.latency object, so every ingested row from this endpoint will store latency_ms as 0. That makes the materialized view's average and p95 latency fields unusable for dashboards/anomaly analysis even though the source API provides the value; map the documented duration field instead.
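
Following this suggestion, the latency mapping would read the `duration` field instead of `timings.latency`. A minimal sketch, assuming `duration` is a number already expressed in milliseconds (the unit is an assumption here) and using a hypothetical helper name `latencyMsFromLog`:

```typescript
// Hypothetical shape: only the field the review comment refers to.
interface GatewayLogTiming {
  duration?: number | null; // assumed to be milliseconds
}

/** Map the gateway log's duration to latency_ms, keeping the existing 0 default. */
function latencyMsFromLog(log: GatewayLogTiming): number {
  return Math.round(log.duration ?? 0);
}
```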

chitcommit · 2026-06-15T05:10:57Z

@claude resolve conflicts

chitcommit and others added 3 commits June 11, 2026 00:14

Copilot AI review requested due to automatic review settings June 11, 2026 01:27

Copilot started reviewing on behalf of chitcommit June 11, 2026 01:27

Copilot AI reviewed Jun 11, 2026

chatgpt-codex-connector Bot reviewed Jun 11, 2026

chitcommit enabled auto-merge (squash) June 15, 2026 05:10

chitcommit wants to merge 3 commits into main from feat/comptroller-phase-a-cf-ai-gateway-ingest
