feat(vendor-charge): ingest ChittyScrape portal charges into the books #134
chitcommit wants to merge 1 commit into
@coderabbitai review Please evaluate:
Code Review: feat(vendor-charge): ingest ChittyScrape portal charges into the books
Overall the wiring is clean and well-motivated. The Zod validation, idempotent upsert pattern, and COA-justification comments are all solid. A few issues need addressing before merge, one of them security-critical.
🔴 Critical: Endpoint is unauthenticated
// server/app.ts: protectedPrefixes does NOT include '/api/vendor-charge'
const protectedPrefixes = [
  '/api/accounts', '/api/transactions', ...
  // '/api/vendor-charge' is missing
];
In practice
Fix: add
🟠 Bug:
Canonical "ChittyScrape feeds the cost flow" wiring.

ChittyScrape (scrape.chitty.cc) extracts vendor charges from portals with no API (registered-agent fees, utilities, mortgage). This adds a real ChittyFinance ingest that turns a ChittyScrape result envelope + resolved charge into an idempotent expense transaction, reusing the PR #133 comptroller pattern.

- storage.upsertVendorCharge: idempotent by external_id `scrape:{portalId}:{period}:{vendor}`, type='expense', enforced by the partial unique index transactions_tenant_external_idx (external_id NOT NULL). amount is REQUIRED (> 0) and decimal(12,2); exact USD + paymentStatus + category preserved in metadata. onConflictDoUpdate targetWhere matches the partial arbiter.
- POST /api/vendor-charge/ingest: validates the envelope + charge (zod), rejects success=false (422) and a missing/zero amount (400, never defaults to 0), maps vendor category -> real COA code, books against the account by external_id (default chittyos-infra), audit-logs via ledgerLog.
- COA mapping (verified against seeded COA on Neon solitary-rice-14149088, no invented codes): registered-agent -> 5050 Legal & Professional Fees; utilities -> 5100/5110/5120/5130/5140; mortgage -> 5300 Mortgage Interest.

nw-registered-agent mapping: scraper annualFeeUsd -> charge.amountUsd, paymentStatus ('ok'|'failed') -> charge.paymentStatus, portal 'nw-registered-agent' -> category 'registered-agent' -> COA 5050, period = filing year.

Verified end-to-end on real Neon (solitary-rice-14149088): upserted a real Northwest Registered Agent $125/yr charge via the exact storage SQL, selected it back (COA 5050, type expense), re-ran the upsert: same row id, 1 row, no duplicate (idempotency proven). npm run check clean.

NOTE: the live portal scrape is credential-gated (NW login via ChittyConnect); the registration + ingest wiring + mapping are real and verified, but the live scrape -> ingest hop is the chico/ChittyConnect follow-up.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 993a28dea4
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
app.route('/', classificationRoutes);
app.route('/', emailRoutes);
app.route('/', comptrollerRoutes);
app.route('/', vendorChargeRoutes);
Add vendor-charge to protected API middleware
Mounting this route without adding /api/vendor-charge to protectedPrefixes leaves POST /api/vendor-charge/ingest outside the auth/tenant/storage middleware that the handler assumes. I checked server/app.ts: only the prefixes in lines 102-107 get storageMiddleware, hybridAuth, and tenantMiddleware, so a valid ingest request reaches storage.getAccountByExternalId(...) with storage/tenantId unset and fails instead of booking the charge; it also bypasses the intended service-token/tenant checks for a route that writes transactions.
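As a rough illustration of the fix, the prefix gate can be sketched as a small predicate. This is a hypothetical stand-in, not the real server/app.ts: the shortened prefix list and the `isProtectedPath` helper are assumptions made for the example.

```typescript
// Hypothetical sketch: a shortened prefix list, not the full one in server/app.ts.
const protectedPrefixes: readonly string[] = [
  "/api/accounts",
  "/api/transactions",
  "/api/vendor-charge", // the prefix the review asks to add
];

// True when the path is the prefix itself or sits below it as a path segment.
// Matching on "prefix + '/'" avoids treating "/api/vendor-charges" as protected
// just because it shares leading characters.
function isProtectedPath(path: string, prefixes: readonly string[]): boolean {
  return prefixes.some((p) => path === p || path.startsWith(p + "/"));
}

console.log(isProtectedPath("/api/vendor-charge/ingest", protectedPrefixes));
```

With the prefix present, the ingest path would pass through whatever middleware is attached to protected prefixes before the handler touches storage.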
    classifiedAt: now,
    metadata,
  })
  .onConflictDoUpdate({
Prevent scraper upserts from changing reconciled rows
When the same scrape:{portal}:{period}:{vendor} charge already exists, this conflict branch updates the transaction unconditionally, including amount, coaCode, and metadata. If the finance team reconciles that transaction and the scrape cron later replays or extracts a corrected amount/category, the locked row is mutated without the reconciled-row guard used by trust-path updates, corrupting a reconciled period.
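One way to picture the guard is an in-memory model of the upsert in which the conflict branch refuses to touch a reconciled row. Everything here is an assumption for illustration: the `TxRow` shape, the `reconciled` flag, and the `upsertWithReconciledGuard` helper do not come from the PR. In Postgres the equivalent would be a WHERE condition on the DO UPDATE branch of INSERT ... ON CONFLICT.

```typescript
// Hypothetical row shape; the real transactions table has more columns.
interface TxRow {
  externalId: string;
  amount: string;
  coaCode: string;
  reconciled: boolean;
}

type UpsertResult = {
  status: "inserted" | "updated" | "skipped_reconciled";
  row: TxRow;
};

// Insert when the key is new, update when the existing row is still open,
// and leave the row untouched once it has been reconciled.
function upsertWithReconciledGuard(
  store: Map<string, TxRow>,
  incoming: Omit<TxRow, "reconciled">,
): UpsertResult {
  const existing = store.get(incoming.externalId);
  if (existing === undefined) {
    const row: TxRow = { ...incoming, reconciled: false };
    store.set(row.externalId, row);
    return { status: "inserted", row };
  }
  if (existing.reconciled) {
    return { status: "skipped_reconciled", row: existing };
  }
  const row: TxRow = { ...existing, ...incoming };
  store.set(row.externalId, row);
  return { status: "updated", row };
}
```

Returning a distinct status for the skipped case lets the caller log or alert on a replayed scrape that disagrees with a locked row instead of silently dropping it.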
  );
}

const category = charge.category ?? PORTAL_DEFAULT_CATEGORY[envelope.portal];
Do not trust the scraper-supplied category as authoritative
Because charge.category takes precedence over the portal default and is immediately mapped into an authoritative coaCode, any caller that reaches this ingest can book a known portal under a different COA (for example, posting category: 'mortgage' for nw-registered-agent). Even with the route protected, this delegates classification authority to the scrape payload rather than validating the category against the portal or requiring an executor role, so a bad scrape result can misclassify tax/reporting data.
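A minimal sketch of validating the category against the portal, rather than trusting the payload, might look like the following. The `PORTAL_ALLOWED_CATEGORIES` table and `resolveCategory` helper are hypothetical; only the nw-registered-agent to registered-agent pairing is taken from the PR, and the error codes are invented for the example.

```typescript
// Hypothetical allow-list: each portal names the categories it may book under.
// The first entry doubles as the portal default.
const PORTAL_ALLOWED_CATEGORIES: Record<string, readonly string[]> = {
  "nw-registered-agent": ["registered-agent"],
};

type CategoryResult =
  | { ok: true; category: string }
  | { ok: false; error: string };

// Accept the requested category only if the portal is known and allows it;
// fall back to the portal default when the payload omits a category.
function resolveCategory(portal: string, requested?: string): CategoryResult {
  const allowed = PORTAL_ALLOWED_CATEGORIES[portal];
  if (allowed === undefined || allowed.length === 0) {
    return { ok: false, error: "unknown_portal" };
  }
  if (requested === undefined) {
    return { ok: true, category: allowed[0] as string };
  }
  if (!allowed.includes(requested)) {
    return { ok: false, error: "category_not_allowed_for_portal" };
  }
  return { ok: true, category: requested };
}
```

Under this scheme a payload with category 'mortgage' for nw-registered-agent would be rejected before any COA code is chosen.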

const [row] = await this.db
  .insert(schema.transactions)
  .values({
Preserve property attribution for portal expenses
The new ingest path supports utilities and mortgage statements, but the inserted transaction never carries a propertyId (and the request schema has no way to pass one). Property P&L is built from getPropertyTransactions(propertyId, ...), and Schedule E puts no-property transactions into entity-level handling, so scraped property-specific bills will be missing from the property's financials/tax columns even though they were booked.
993a28d to 456ef72
@coderabbitai review Please evaluate:
Code Review: feat(vendor-charge): ingest ChittyScrape portal charges into the books
Overall this is a clean, well-scoped PR. The idempotency approach, Zod validation, and tenant-scoping pattern all follow established project conventions. A few issues to address before merge.
Bugs / Correctness
utility: '5140', // generic utility fallback (internet/cable) when unspecified
A caller passing
// Remove the 'utility' fallback entry entirely.
// If the portal default or explicit category is 'utility', the unknown_category
// guard below already returns 400 with the full list of known codes.
If a true parent-level "utility catch-all" COA code exists in the chart, add it with the right code and a clear comment; otherwise let the 400 surface the problem.
ExternalId uses vendor name verbatim: vendor-name drift creates duplicate rows
const externalId = `scrape:${input.portalId}:${input.period}:${input.vendor}`;
If the scraper emits
const externalId = `scrape:${input.portalId}:${input.period}`;
Worth an explicit decision here; at minimum document the normalisation contract.
Upsert does not update
The
Security
No explicit auth contract for ChittyCommand → this endpoint
The PR description says the route is "driven by whatever dispatches the scrape (ChittyCommand cron)". The route sits behind
Code Quality / Conventions
Comment density exceeds CLAUDE.md guidance
CLAUDE.md: "Only add [a comment] when the WHY is non-obvious." The top-of-file architectural narrative and the multi-paragraph storage docblock mostly duplicate the PR description. The partial-index
COA comments in
The inline comments on each COA code (explaining why 5050 vs 6040, etc.) are the right kind of non-obvious context; keep those.
Performance / Minor
Test Coverage
Route handler + zod branches are typecheck-only (acknowledged). Before this ships to production, the following cases should have at minimum a happy-path integration test:
Dependency Note
This is stacked on unmerged #133. The
Summary
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 456ef7250b
  );
}

const date = charge.date ? new Date(charge.date) : new Date(envelope.scrapedAt);
Require a period date before booking scraped charges
When charge.date is omitted, this records the transaction on envelope.scrapedAt even though charge.period is the billing period. A late or backfilled scrape such as period: "2025" scraped on 2026-01-15 will be filtered into the 2026 tax/reporting window because reports use transactions.date rather than the metadata period, so the expense is omitted from the intended year. Require charge.date for periodized bills or derive the transaction date from charge.period before inserting.
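Deriving the date from the period could be sketched as below. The accepted period formats ("YYYY" and "YYYY-MM") and the choice of the last day of the period in UTC are assumptions for illustration; the PR only shows a year-style period such as "2025". The `periodToDate` and `resolveTransactionDate` names are hypothetical.

```typescript
// Map a billing period to the last day of that period (UTC).
// Returns null for any format this sketch does not recognise.
function periodToDate(period: string): Date | null {
  const m = /^(\d{4})(?:-(\d{2}))?$/.exec(period);
  if (m === null) return null;
  const year = Number(m[1]);
  const month = m[2] === undefined ? 12 : Number(m[2]);
  if (month < 1 || month > 12) return null;
  // Day 0 of the following month is the last day of the requested month.
  return new Date(Date.UTC(year, month, 0));
}

// Prefer an explicit charge date, then the period; never fall back to scrape time.
function resolveTransactionDate(chargeDate: string | undefined, period: string): Date | null {
  if (chargeDate !== undefined) {
    const d = new Date(chargeDate);
    return Number.isNaN(d.getTime()) ? null : d;
  }
  return periodToDate(period);
}
```

A null result would then map to a 400 response, so a backfilled scrape for period "2025" lands in 2025 regardless of when it was scraped.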
  paymentStatus?: string; // scraped 'ok' | 'failed'; recorded, does not gate the expense
  metadata?: Record<string, unknown>;
}) {
  const externalId = `scrape:${input.portalId}:${input.period}:${input.vendor}`;
Include account identity in the scrape idempotency key
For portals that can have more than one bill in the same period, such as two ComEd utility accounts or two mortgage accounts, both ingests use the same scrape:{portalId}:{period}:{vendor} key even when the caller passes different accountExternalIds. The second charge conflicts with the first and updates its amount/metadata instead of creating a separate transaction, so one account's bill is lost from the books; include an account, meter, or statement identifier in the key for multi-account vendors.
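A sketch of a key that carries account identity and a normalised vendor segment follows. The `normalizeVendor` and `buildExternalId` helpers, the trailing account segment, and the "comed" portal id are assumptions for illustration, not the PR's format. The normalisation only covers case, whitespace, and punctuation; it would not reconcile genuinely different spellings of a vendor name.

```typescript
// Lowercase, collapse runs of non-alphanumerics to single hyphens, trim hyphens.
function normalizeVendor(vendor: string): string {
  return vendor
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

interface KeyInput {
  portalId: string;
  period: string;
  vendor: string;
  accountExternalId?: string; // optional discriminator for multi-account vendors
}

// Build the idempotency key; the account segment is appended only when supplied.
function buildExternalId(input: KeyInput): string {
  const base = `scrape:${input.portalId}:${input.period}:${normalizeVendor(input.vendor)}`;
  return input.accountExternalId === undefined ? base : `${base}:${input.accountExternalId}`;
}
```

Changing the key format would leave rows written under the old format unmatched, so a one-off migration of existing external_id values would likely be needed.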
What
Canonical "ChittyScrape feeds the cost flow" wiring. Turns a ChittyScrape-extracted vendor charge (registered-agent fee, utility bill, mortgage statement) into an idempotent ChittyFinance expense transaction, reusing the PR #133 comptroller pattern.
Changes
- storage.upsertVendorCharge: idempotent by external_id = scrape:{portalId}:{period}:{vendor}, type='expense', partial-index arbiter. amount REQUIRED (>0), never defaults to 0; exact USD + paymentStatus + category preserved in metadata.
- POST /api/vendor-charge/ingest (server/routes/vendor-charge.ts): zod-validates envelope+charge; rejects success=false (422) and missing/zero amount (400); maps vendor category → real COA; books against account by external_id (default chittyos-infra); audit-logs via ledgerLog.
COA mapping (real codes, verified on Neon solitary-rice-14149088; no invented codes)
nw-registered-agent → ingest field mapping
NWAgentResult field → ingest field:
- annualFeeUsd → charge.amountUsd (required)
- paymentStatus 'ok'|'failed' → charge.paymentStatus (recorded, does not gate the expense)
- portal nw-registered-agent → category registered-agent → COA 5050
- filing year → charge.period
- envelope.scrapedAt → charge.date fallback
(Scraper side, the annualFeeUsd extraction, is chittyos/chittyscrape#21.)
Verification: what was actually exercised
- Real Neon (solitary-rice-14149088). Ran the exact upsertVendorCharge INSERT…ON CONFLICT SQL with a real Northwest Registered Agent $125/yr charge → row created (COA 5050, type=expense); re-ran → same row id, 1 row, no duplicate (idempotency proven). The verification row was then deleted: it was a fixture, not a real NW bill, and must not persist in IT CAN BE LLC's books.
- npm run check clean.
Credential-gated follow-up (chico/ChittyConnect)
The live portal scrape (NW login) is credential-gated. Registration + ingest wiring + mapping are real and verified per above; the live scrape→ingest hop needs ChittyConnect creds.
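For readers who want the ingest rules from the Changes list in one place, the status-code behaviour can be sketched without zod as a plain function. The `IngestBody` shape, the `checkIngest` name, the error strings, and the order of the checks are assumptions; only the 422 for success=false, the 400 for a missing or zero amount, and the two-decimal amount come from the description.

```typescript
// Hypothetical request shape, reduced to the fields the rules mention.
interface IngestBody {
  envelope: { success: boolean; portal: string; scrapedAt: string };
  charge: { vendor: string; period: string; amountUsd?: number };
}

type IngestCheck =
  | { status: 200; amount: string }
  | { status: 400 | 422; error: string };

// Reject failed scrapes, then require a finite amount above zero.
// The amount is never defaulted; it is formatted to two decimals for storage.
function checkIngest(body: IngestBody): IngestCheck {
  if (!body.envelope.success) {
    return { status: 422, error: "scrape_failed" };
  }
  const amount = body.charge.amountUsd;
  if (amount === undefined || !Number.isFinite(amount) || amount <= 0) {
    return { status: 400, error: "amount_required" };
  }
  return { status: 200, amount: amount.toFixed(2) };
}
```

For the Northwest Registered Agent fixture described above, an amountUsd of 125 would pass and be stored as "125.00".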
🤖 Generated with Claude Code