38 changes: 38 additions & 0 deletions src/scrapers/nw-registered-agent.ts
@@ -18,6 +18,39 @@ export interface NWAgentResult {
mailForwardingStatus?: string;
paymentStatus?: string;
alerts: string[];
/**
* Annual registered-agent fee in USD, parsed from the account/billing page
* when the portal surfaces it. Undefined when no fee is shown (e.g. the
* inbox view carries no billing total) — consumers that book a cost MUST
* treat absence as "amount unknown", never as $0. This is the monetary
* field ChittyFinance's vendor-charge ingest consumes.
*/
annualFeeUsd?: number;
}

/**
* Parse a registered-agent annual fee from portal page text.
*
* Northwest's billing/account pages render the renewal cost near phrases like
* "Registered Agent", "Annual Fee", "Renewal", or "Service Fee" followed by a
* dollar amount. We scan for those anchors and return the nearest USD figure.
* Returns undefined when nothing matches — the caller must not default to 0.
*/
export function parseAnnualFee(text: string): number | undefined {
if (!text) return undefined;
const anchors = /(registered\s+agent|annual\s+fee|renewal|service\s+fee|yearly)/i;

P2: Require the renewal amount to be for registered-agent service

Because `renewal` alone is treated as a fee anchor, any dashboard line that carries a `$` amount, such as an annual-report renewal or another renewal notice, is returned as `annualFeeUsd` before a later registered-agent charge is considered. In accounts with annual-report reminders or multiple renewal notices, this can book the wrong vendor-charge amount. The match should be scoped to registered-agent or service-fee billing text rather than any renewal line.
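
A minimal sketch of one way to scope the anchor, offered as an illustration rather than the PR's fix. The helper name `isRegisteredAgentFeeLine` and the list of competing services are assumptions; the real portal wording may need different patterns.

```typescript
// Hypothetical helper, not part of the PR: decides whether a line is
// plausibly about the registered-agent charge.
const RA_BILLING = /(registered\s+agent|service\s+fee|annual\s+fee)/i;

// Assumed examples of other services that also show renewal amounts.
const COMPETING_SERVICE = /(annual\s+report|franchise\s+tax|mail\s+forwarding)/i;

export function isRegisteredAgentFeeLine(line: string): boolean {
  // A line naming another service is rejected even if it says "renewal".
  if (COMPETING_SERVICE.test(line)) return false;
  // "renewal" or "yearly" alone no longer qualifies; the line must name
  // registered-agent or fee billing explicitly.
  return RA_BILLING.test(line);
}
```

With this filter, a bare renewal notice is skipped and the parser keeps scanning for a line that names the registered-agent charge.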

// $amount allowing thousands separators and optional cents.
const money = /\$\s?([0-9]{1,3}(?:,[0-9]{3})*(?:\.[0-9]{2})?|[0-9]+(?:\.[0-9]{2})?)/;
for (const rawLine of text.split(/\n|\.|;/)) {
Comment on lines +43 to +44

P2: Parse full fee amounts before returning

When the portal shows a four-digit fee without a thousands separator, or a fee with cents, this parser can emit a smaller amount than shown: the first regex alternative matches only the `$100` prefix of `$1000`, and `text.split(/\n|\.|;/)` splits `$125.50` at the decimal point before the cents are parsed. Because `annualFeeUsd` is consumed as the booked cost amount, these common currency formats would silently understate the charge instead of returning the portal value.
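
One possible shape for the fix, sketched under assumptions: `parseUsd` and `splitSegments` are hypothetical names, and the pattern requires at least one `,ddd` group in the comma-separated alternative so that plain digit runs fall through to the second alternative.

```typescript
// Comma-grouped form needs one or more ",ddd" groups; otherwise take the
// whole digit run. Optional two-digit cents in both cases.
const MONEY = /\$\s?(\d{1,3}(?:,\d{3})+(?:\.\d{2})?|\d+(?:\.\d{2})?)/;

// Hypothetical helper: parse the first USD amount in a segment.
export function parseUsd(segment: string): number | undefined {
  const m = segment.match(MONEY);
  const raw = m?.[1];
  if (raw === undefined) return undefined;
  const val = Number(raw.replace(/,/g, ""));
  // Absence or a non-positive value stays undefined, never 0.
  return Number.isFinite(val) && val > 0 ? val : undefined;
}

// Hypothetical helper: split on newlines and semicolons only, so the
// decimal point in "$125.50" stays inside its segment.
export function splitSegments(text: string): string[] {
  return text
    .split(/\n|;/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);
}
```

Dropping `.` from the split set means two sentences on one line share a segment, so the anchor check sees slightly more text per segment than before.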

const line = rawLine.trim();
if (!anchors.test(line)) continue;
const m = line.match(money);
Comment on lines +44 to +47

P2: Match fee labels split from their amounts

This only checks for a dollar amount on the same split segment as the anchor, so common billing layouts, such as a table or card with `Annual Fee` in one cell or line and `$125` in the adjacent value cell or line, return `undefined`. Since the scraper now exposes `annualFeeUsd` for the downstream cost flow, those label/value layouts will look like an unknown fee even though the portal surfaced the amount. The parser should inspect nearby text after an anchor rather than only the current line.
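
A rough sketch of a lookahead scan, assuming the label and its value land on adjacent non-empty lines of the page text. `feeNearAnchor`, its `lookahead` parameter, and the default window of two lines are illustrative choices, not taken from the PR.

```typescript
const ANCHOR = /(registered\s+agent|annual\s+fee|service\s+fee)/i;
const MONEY = /\$\s?(\d{1,3}(?:,\d{3})+(?:\.\d{2})?|\d+(?:\.\d{2})?)/;

// Hypothetical helper: after an anchor line, also check the next
// `lookahead` non-empty lines for a dollar amount.
export function feeNearAnchor(text: string, lookahead = 2): number | undefined {
  const lines = text
    .split(/\n/)
    .map((l) => l.trim())
    .filter((l) => l.length > 0);

  for (let i = 0; i < lines.length; i++) {
    const line = lines[i];
    if (line === undefined || !ANCHOR.test(line)) continue;

    // Window covers the anchor line itself plus the lines that follow it.
    for (const candidate of lines.slice(i, i + 1 + lookahead)) {
      const raw = candidate.match(MONEY)?.[1];
      if (raw === undefined) continue;
      const val = Number(raw.replace(/,/g, ""));
      if (Number.isFinite(val) && val > 0) return val;
    }
  }
  return undefined; // Unknown amount; callers must not treat this as 0.
}
```

A small window keeps the risk of picking up an unrelated amount low, but it does not remove it; combining the lookahead with a stricter anchor would be safer.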

if (m) {
const val = parseFloat(m[1].replace(/,/g, ''));
if (Number.isFinite(val) && val > 0) return val;
}
}
return undefined;
}

/**
@@ -294,6 +327,10 @@ async function scrapeNWRegisteredAgent(

const alerts = accountData?.alerts || [];

// Best-effort parse of the annual fee from the account page text. Undefined
// when the portal view shows no fee — never coerced to 0.
const annualFeeUsd = parseAnnualFee(accountData?.bodySnippet || '');

P2: Avoid truncating the text used for fee extraction

This calls the fee parser on `bodySnippet`, but `accountData` only stores `bodyText.slice(0, 2000)`. On a logged-in dashboard where navigation, banners, or entity lists push the billing/renewal section past the first 2000 characters, the new `annualFeeUsd` field will be omitted even though the page contains the fee, so the downstream cost flow sees an unknown amount. Use the full page text, or a billing-specific extraction, for the new monetary field instead of the display snippet.
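
To illustrate the truncation, here is a sketch that keeps the full text next to the display snippet. The `AccountData` shape, the `bodyText` field, and `buildAccountData` are assumptions for illustration; the PR only shows that `bodySnippet` and `alerts` exist on `accountData`.

```typescript
// Assumed shape: the real accountData type is not shown in the diff.
interface AccountData {
  bodySnippet: string; // first 2000 chars, kept for display/logging
  bodyText: string;    // full page text, intended for fee extraction
  alerts: string[];
}

// Hypothetical builder that stores both the snippet and the full text.
export function buildAccountData(
  bodyText: string,
  alerts: string[] = [],
): AccountData {
  return { bodySnippet: bodyText.slice(0, 2000), bodyText, alerts };
}

// A page whose billing line sits beyond the 2000-character cutoff.
const page = "nav ".repeat(600) + "\nRegistered Agent Annual Fee: $125";
const data = buildAccountData(page);

console.log(data.bodySnippet.includes("$125")); // false: cut off by the slice
console.log(data.bodyText.includes("$125"));    // true: full text keeps the fee
```

Under this shape, the call site would pass `accountData?.bodyText` to the fee parser while `bodySnippet` stays available for display.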


return {
success: true,
data: {
@@ -302,6 +339,7 @@ async function scrapeNWRegisteredAgent(
documents,
paymentStatus: alerts.includes('PAYMENT_FAILED') ? 'failed' : 'ok',
alerts,
annualFeeUsd,
},
};
} catch (err: any) {