diff --git a/.claude/commands/bring-up.md b/.claude/commands/bring-up.md index cd034ab..92b0748 100644 --- a/.claude/commands/bring-up.md +++ b/.claude/commands/bring-up.md @@ -9,9 +9,12 @@ Guide me through bring-up for a store, in order, pausing for me between steps 2. `roomieorder doctor` — confirm profile / display / chrome are green. 3. `roomieorder verify-selectors --provider ` — confirm the price and add-to-cart selectors match; fix any MISS off the dom dump first. -4. `roomieorder dry-run --provider ` — confirm it reaches the +4. `roomieorder trace-order --provider ` — walk the whole flow and + confirm the cart/checkout/review selectors (incl. `place-order`/`order-total`/ + payment) resolve; fix any MISS off that step's dom dump before the first order. +5. `roomieorder dry-run --provider ` — confirm it reaches the review page; `Read` the screenshot. -5. Only after a clean dry-run on a cheap item: flip `DRY_RUN=false` and place +6. Only after a clean dry-run on a cheap item: flip `DRY_RUN=false` and place one real order, then `roomieorder queue` to confirm `placed`. Never flip `DRY_RUN` or place a real order without my explicit go-ahead. diff --git a/.claude/commands/trace-order.md b/.claude/commands/trace-order.md new file mode 100644 index 0000000..c81c020 --- /dev/null +++ b/.claude/commands/trace-order.md @@ -0,0 +1,19 @@ +--- +description: Dump DOM + selector probe + screenshot at every checkpoint of the buy flow. +argument-hint: " [--provider costco|amazon]" +--- +Run `roomieorder trace-order $ARGUMENTS`. + +This always forces DRY_RUN — it walks the real buy flow to the review page and +NEVER places an order. At each checkpoint (product → cart → cart view → delivery +→ payment → review) it writes a rendered `*_dom.html`, a selector `*_probe.txt`, +and a screenshot to the shots dir, and prints a per-step PASS/MISS digest. 
+ +Unlike `dump-dom`/`verify-selectors` (which stop at the product page), this +reaches the checkout/review surface where the `place-order`, `order-total`, and +payment selectors finally render. For any group still MISS at a checkout step, +`Read` that step's `*_dom.html` and find the real selector on the live page (per +AGENTS.md §1), then propose the corrected selector(s) for `purchase.py`. + +Do NOT edit `purchase.py` unless I explicitly ask — the buy flow is +additive-only and can only be validated against live DOM during bring-up. diff --git a/.claude/commands/triage-failure.md b/.claude/commands/triage-failure.md index 9547a73..e68e133 100644 --- a/.claude/commands/triage-failure.md +++ b/.claude/commands/triage-failure.md @@ -7,6 +7,8 @@ newest screenshot (and `*_dom.html` / `*_probe.txt` if present). Classify the failure using AGENTS.md §1–§3: selector drift, logged-out / sign-in wall, CAPTCHA/OTP challenge, or an outright Akamai block. State which stage died (from the shot tag) and recommend exactly one next command — e.g. -`dump-dom`, `verify-selectors`, `login`, or `resume`. +`dump-dom`, `verify-selectors`, `login`, or `resume`. For a checkout-stage death +(`no_place_order`, `left_checkout`, `cart_mismatch`), recommend `trace-order`: +it's the only tool that dumps the cart/review DOM where those selectors live. Do not order or log in yourself; just diagnose and recommend. diff --git a/AGENTS.md b/AGENTS.md index d1b8833..94113c4 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -9,13 +9,14 @@ before touching the buy flow, the catalog, or login/bot-detection logic. Start every "why did X break" investigation with the read-only diagnostics below — they're safe (no browser, no spend) and tell you where to look. The `.claude/commands/` slash commands (`/diagnose`, `/triage-failure`, -`/verify-selectors`, `/bring-up`) chain these for you. +`/verify-selectors`, `/trace-order`, `/bring-up`) chain these for you. 
| Symptom | First command | Then read | | --- | --- | --- | | "is anything wrong?" / cold start | `roomieorder doctor` | its own output (config, Chrome, display, profiles, DB, catalog) | | "the order didn't place" | `roomieorder failures` | the newest `*.png` it lists, plus the row's `notes` | | selector miss / store redesign | `roomieorder verify-selectors [item]` | the `*_dom.html` it points at, then read the live selector off it (§1) | +| checkout/review selector miss (`no_place_order` / `left_checkout` / `cart_mismatch`) | `roomieorder trace-order ` (DRY_RUN walk, never orders) | the `*_checkout_landed_dom.html` it dumps — the only place `place-order`/`order-total`/payment selectors render (§1) | | logged out / sign-in wall | `roomieorder dump-dom ` | §2 — prefilled ≠ logged in; check the **logon URL**, not header text | | CAPTCHA / OTP challenge | (worker auto-pauses) `roomieorder status` | §1, §3 — Akamai may be blocking; this is expected-until-verified | | Sheets row never appeared | `roomieorder test-sheet` | the gspread error (`-v`); a no-op logger silently "succeeds" otherwise | @@ -26,7 +27,10 @@ below — they're safe (no browser, no spend) and tell you where to look. The are read-only and allow-listed in `.claude/settings.json`, so they run without a permission prompt. `verify-selectors` (and `dump-dom`) hit live store pages read-only and need a logged-in profile + network — they're operator-run, not -CI. +CI. `trace-order` is the same footprint but walks the *whole* flow to the review +page (always DRY_RUN — never orders); add +`Bash(roomieorder trace-order:*)` to the settings allow-list to run it without a +prompt. 
**Queue statuses** (`store.py`, also the Sheets `status` column): `pending` / `in_progress` (transient); `placed` (done); `dry_run`; `skipped_cooldown` / @@ -46,9 +50,11 @@ cart-singleton guard saw more than the intended item — NOT placed) / `signin_* / `challenge_*` / `blocked_*` / `left_checkout` / `submitted_unconfirmed` / `confirmation` / `review` / `timeout` / `crash` / `dump`. Diagnostic tags are captured full-page (below-the-fold banners included); the `review`/`confirmation` -/`dump` shots stay header-only. `verify-selectors` and `dump-dom` also write -`*_dom.html` (rendered page) and `*_probe.txt` (per-selector match counts) — -`Read` those to find the real selector instead of guessing. The shots dir is +/`dump` shots stay header-only. `trace-order` adds a per-step family tagged +`trace{HHMMSS}_{NN}_{step}` (full-page, so the whole cart/review page is caught). +`verify-selectors`, `dump-dom`, and `trace-order` also write `*_dom.html` +(rendered page) and `*_probe.txt` (per-selector match counts) — `Read` those to +find the real selector instead of guessing. The shots dir is pruned automatically (worker) and via `roomieorder prune-shots` (`ROOMIEORDER_SHOTS_RETENTION_DAYS`, default 30). @@ -95,6 +101,19 @@ logged-in profile. (`_PRICE_SELECTORS` already has a structured-data fallback `og:price`/`product:price:amount` meta tags, then JSON-LD `offers.price` — for when the visible-price CSS guesses miss on the `/p/-//` storefront.) +**Reaching the cart/checkout selectors — `roomieorder trace-order `.** +`dump-dom` stops at the product page, so the `place-order`/`order-total`/payment +selector groups always read `count=0` there. 
 `trace-order` (also DRY_RUN, never +orders — it forces `dry_run` and halts at the review page) attaches a +`purchase.FlowTracer` that drops the same DOM + probe + screenshot trio at every +checkpoint of the *real* buy path (`product_loaded` → `price_read` → `cart_added` → `cart_view` +→ `delivery_continue` → `payment_selected` → `checkout_landed` → +`review_pre_place`). To fix a checkout selector miss, run it and `Read` the +`*_checkout_landed_dom.html` — that's the SinglePageCheckoutView where those +selectors live. The same tracer rides live worker orders when +`ROOMIEORDER_TRACE_ORDERS=true` (default off — adds per-step I/O; an advanced +escape hatch for a recurring mid-checkout failure, not a default). + The assistant's own Bash shell on host `link` can reach the graphical session, so headed Playwright (`dump-dom`, `dry-run`, `login`) can be driven directly from Bash against a logged-in profile dir when faster iteration is wanted — but diff --git a/src/roomieorder/cli.py b/src/roomieorder/cli.py index 788938a..c1d0088 100644 --- a/src/roomieorder/cli.py +++ b/src/roomieorder/cli.py @@ -11,6 +11,7 @@ * ``login`` — open the profile headed to sign into Costco by hand. * ``dry-run KEY`` — drive one item to its review page and screenshot, no order. * ``dump-dom KEY`` — read-only DOM dump + selector probe for bring-up. +* ``trace-order KEY`` — DRY_RUN walk dumping DOM + probe + screenshot per step. * ``verify-selectors`` — probe live pages for stale buy-flow selectors. * ``doctor`` — one-shot, read-only health check of every subsystem (``--check-login`` adds a per-store signed-in probe). @@ -304,6 +305,78 @@ def dump_dom(item_key: str, provider: str) -> None: click.echo(result.summary) +# Selector groups worth a one-line PASS/MISS digest per checkpoint in the +# trace-order table — the buy-flow groups, skipping the noisier price-meta/signin. 
+_DIGEST_GROUPS = ("price", "add-to-cart", "buy-now", "place-order", "order-total") + + +@main.command(name="trace-order") +@click.argument("item_key") +@_PROVIDER_OPT +def trace_order(item_key: str, provider: str) -> None: + """Walk ITEM_KEY through the whole buy flow, dumping every step — never orders. + + Forces DRY_RUN (like ``dry-run``) so it always halts at the review page + *before* Place Order, then attaches a tracer that writes a rendered DOM, a + selector probe, and a screenshot at each checkpoint — product page, cart, + cart view, delivery, payment, and the review page. Unlike ``dump-dom`` (which + stops at the product page), this reaches the checkout/review surface where the + ``place-order``/``order-total``/payment selectors finally render, so they + become discoverable. Hits live store pages, so it's operator-run, not CI. + """ + from roomieorder.purchase import FlowTracer, new_run_id + + config = load_config() + config = config.model_copy(update={"dry_run": True}) + items = load_catalog(config.catalog_path) + item = items.get(item_key) + if item is None: + raise click.ClickException(f"unknown item_key: {item_key} (have: {', '.join(items)})") + source = _source_for(item, provider) + + store = Store(config.db_path) + store.init_db() + purchaser = _purchaser_for(config, provider) + + def proceed_check(live_price: float): # type: ignore[no-untyped-def] + ceiling = check_price_ceiling(item.title, source.price_ceiling, live_price) # type: ignore[attr-defined] + if not ceiling.ok: + return ceiling + return check_spend_cap(store, config, live_price * item.qty) + + tracer = FlowTracer(purchaser, item_key, run_id=new_run_id()) # type: ignore[arg-type] + click.echo(f"trace-order {item_key} ({provider}) → {purchaser._resolve_url(source)}") # type: ignore[attr-defined] + result = purchaser.buy(item_key, item, source, proceed_check, tracer=tracer) # type: ignore[attr-defined] + store.close() + + click.echo(f"status: {result.status}") + click.echo(f"unit_price: 
{result.unit_price}") + click.echo(f"order_total: {result.order_total}") + click.echo(f"message: {result.message}") + click.echo("") + click.echo(f"steps ({len(tracer.steps)}):") + any_artifact = False + for step in tracer.steps: + hits = _group_hits(step.summary) + digest = " ".join( + f"{g}={'ok' if hits.get(g) else 'MISS'}" + for g in _DIGEST_GROUPS + if g in hits + ) + click.echo(f" {step.idx:02d} {step.name:18} {step.url}") + click.echo(f" {digest}") + if step.probe: + any_artifact = True + click.echo(f" probe: {step.probe}") + if step.html: + click.echo(f" dom: {step.html}") + if step.screenshot: + click.echo(f" shot: {step.screenshot}") + if any_artifact: + click.echo("") + click.echo("For any group still MISS at a checkout step, Read that step's *_dom.html to find the live selector.") + + # Statuses that mean an order didn't cleanly place — what `failures` surfaces. _TROUBLE_STATUSES = ( "failed", diff --git a/src/roomieorder/config.py b/src/roomieorder/config.py index 0044e44..14f65ab 100644 --- a/src/roomieorder/config.py +++ b/src/roomieorder/config.py @@ -128,6 +128,15 @@ class Config(BaseModel): auto_retry: bool = False auto_retry_max: int = Field(default=1, ge=0) + # Opt-in full-flow tracing on *live* worker orders (off by default). When on, + # every buy attaches a purchase.FlowTracer that dumps a DOM + selector probe + + # screenshot at each checkout step into shots_dir — the same artifacts the + # `trace-order` CLI produces, but for real runs, so a mid-checkout failure + # leaves the whole trail. Adds page.content()+screenshot I/O per step, so it's + # an advanced troubleshooting escape hatch, not a default. The pruner covers + # the extra artifacts via shots_retention_days. + trace_orders: bool = False + # Dead-man's-switch heartbeat. 
The worker pings this URL on a timer; a missed # ping alerts via whatever push-style monitor it points at — hosted # Healthchecks.io or a self-hosted open-source instance, Uptime Kuma push, @@ -215,6 +224,7 @@ def load_config() -> Config: openclaw_channel=_env_str("OPENCLAW_CHANNEL", "telegram"), auto_retry=_env_bool("ROOMIEORDER_AUTO_RETRY", False), auto_retry_max=_env_int("ROOMIEORDER_AUTO_RETRY_MAX", 1), + trace_orders=_env_bool("ROOMIEORDER_TRACE_ORDERS", False), heartbeat_url=_env_str("ROOMIEORDER_HEARTBEAT_URL", ""), heartbeat_interval_seconds=_env_int("ROOMIEORDER_HEARTBEAT_INTERVAL_SECONDS", 300), session_check_hours=_env_float("ROOMIEORDER_SESSION_CHECK_HOURS", 0.0), diff --git a/src/roomieorder/orchestrator.py b/src/roomieorder/orchestrator.py index a0980cc..a1dfdc2 100644 --- a/src/roomieorder/orchestrator.py +++ b/src/roomieorder/orchestrator.py @@ -23,8 +23,10 @@ AmazonPurchaser, BasePurchaser, CostcoPurchaser, + FlowTracer, ProceedCheck, PurchaseResult, + new_run_id, ) from roomieorder.store import Store @@ -104,7 +106,19 @@ def buy(self, item_key: str, item: CatalogItem) -> PurchaseResult: for idx, (name, source, purchaser) in enumerate(chain): is_last = idx == len(chain) - 1 - result = purchaser.buy(item_key, item, source, self._proceed_check(item, source)) + # Opt-in full-flow trace on live orders (config.trace_orders, default + # off). Each store-leg gets its own run_id so a Costco→Amazon fallback + # keeps its two traces apart. Off → the no-op default keeps the buy + # byte-for-byte unchanged. 
+ tracer = ( + FlowTracer(purchaser, item_key, run_id=new_run_id()) + if self.config.trace_orders + else None + ) + kwargs = {"tracer": tracer} if tracer is not None else {} + result = purchaser.buy( + item_key, item, source, self._proceed_check(item, source), **kwargs + ) result.provider = name if result.status in _FALLBACK_STATUSES and not is_last: diff --git a/src/roomieorder/purchase.py b/src/roomieorder/purchase.py index 2a8172f..0e45d7f 100644 --- a/src/roomieorder/purchase.py +++ b/src/roomieorder/purchase.py @@ -30,7 +30,10 @@ ⚠️ Every selector, marker, and order-number regex below is a best-guess against a live DOM nobody here can see. Each DOM-dependent constant is flagged ``# TODO(): verify against live DOM`` and MUST be confirmed during bring-up -(`roomieorder login` / `dry-run` / `dump-dom`). +(`roomieorder login` / `dry-run` / `dump-dom` / `trace-order`). The checkout-only +selectors (place-order / order-total / payment) render past the product page, so +`trace-order` — which dumps a DOM + probe + screenshot at every checkout step — is +the tool that reaches them. """ from __future__ import annotations @@ -38,7 +41,7 @@ import json import logging import re -from dataclasses import dataclass +from dataclasses import dataclass, field from datetime import datetime, timezone from pathlib import Path from typing import TYPE_CHECKING, Any, Callable, Generic, Literal, Optional, TypeVar @@ -110,8 +113,9 @@ def _playwright_api() -> object: # below the fold, which a header-only shot crops out. The happy-path `review` # and `confirmation` shots and the `dump` bring-up shot stay header-only — they # go out over the notifier, where a tall full-page PNG is just bulk. The -# blocked_/challenge_/signin_ families carry a `_{where}` suffix, so they match -# by prefix. 
+# blocked_/challenge_/signin_ families carry a `_{where}` suffix, and the +# `trace…` family (FlowTracer's per-step `{run_id}_{NN}_{name}` tags) wants the +# whole cart/review page, so they match by prefix. _FULL_PAGE_TAGS = frozenset( { "no_price", @@ -126,7 +130,7 @@ def _playwright_api() -> object: "crash", } ) -_FULL_PAGE_TAG_PREFIXES = ("blocked_", "challenge_", "signin_") +_FULL_PAGE_TAG_PREFIXES = ("blocked_", "challenge_", "signin_", "trace") def _is_full_page_tag(tag: str) -> bool: @@ -174,6 +178,103 @@ class DumpResult: summary: str = "" +@dataclass +class TraceStep: + """One checkpoint captured by a :class:`FlowTracer` mid-:meth:`BasePurchaser.buy`. + + Parallels :class:`DumpResult` but stamped with the checkpoint's ordinal + (``idx``) and ``name`` so a whole order's steps sort and group together.""" + + name: str + idx: int + url: str = "" + logged_in: bool = False + blocked: bool = False + challenge: bool = False + html: Optional[Path] = None + probe: Optional[Path] = None + screenshot: Optional[Path] = None + summary: str = "" + + +class _NullTracer: + """The default, do-nothing tracer threaded through every live ``buy()``. + + A single shared singleton (:data:`_NULL_TRACER`) is the default argument, so + a production worker/orchestrator buy pays only one inert method call per + checkpoint — no ``page.content()``, no probe, no screenshot, no I/O. This is + the property that keeps a money-moving order byte-for-byte identical to today + whether or not tracing exists.""" + + def checkpoint(self, page: "Page", name: str) -> None: # noqa: D102 — no-op + return None + + +_NULL_TRACER = _NullTracer() + + +def new_run_id() -> str: + """A short per-order id (``trace{HHMMSS}``) prefixing one run's trace files. + + Keeps every checkpoint of a single buy grouped and chronologically sortable + in the flat ``shots_dir`` without a subdirectory. 
The ``trace`` prefix also + routes the step screenshots to a full-page capture (see ``_FULL_PAGE_TAG_PREFIXES``).""" + return "trace" + datetime.now(timezone.utc).strftime("%H%M%S") + + +@dataclass +class FlowTracer: + """Captures DOM + selector probe + screenshot at each buy-flow checkpoint. + + Attached only by the ``trace-order`` CLI command (and the opt-in live trace), + never by a default buy. It rides the *real* :meth:`BasePurchaser.buy` path — + no parallel walk that could drift — so the selectors it probes are exactly + the ones the order would hit. Every checkpoint is best-effort: a capture + failure is logged and swallowed so it can never alter the order's outcome. + + Filenames are tagged ``{run_id}_{NN}_{name}`` so one order's artifacts sort + chronologically and group together in the flat ``shots_dir``.""" + + purchaser: "BasePurchaser[Any]" + item_key: str + run_id: str + steps: list[TraceStep] = field(default_factory=list) + _idx: int = 0 + + def checkpoint(self, page: "Page", name: str) -> None: + self._idx += 1 + idx = self._idx + log = correlated(_logger, provider=self.purchaser.PROVIDER, item=self.item_key) + step = TraceStep(name=name, idx=idx) + tag = f"{self.run_id}_{idx:02d}_{name}" + try: + step.url = page.url + except Exception: # noqa: BLE001 — best-effort + pass + for attr, fn in ( + ("logged_in", self.purchaser.is_logged_in), + ("blocked", self.purchaser._is_blocked), + ("challenge", self.purchaser._is_challenge), + ): + try: + setattr(step, attr, bool(fn(page))) + except Exception: # noqa: BLE001 — best-effort + pass + try: + step.summary = self.purchaser._probe_selectors(page) + except Exception as exc: # noqa: BLE001 — a probe miss must not abort the buy + step.summary = f"probe failed: {exc}" + step.html = self.purchaser._write_text( + self.item_key, f"{tag}_dom", "html", self.purchaser._page_html(page) + ) + step.probe = self.purchaser._write_text( + self.item_key, f"{tag}_probe", "txt", step.summary + ) + step.screenshot = 
self.purchaser._screenshot(page, self.item_key, tag) + self.steps.append(step) + log.info("trace checkpoint %02d %s → %s", idx, name, step.probe) + + # proceed_check(live_price) -> GuardResult. Lets the worker run price-ceiling # and spend-cap guards (which need the store) without pulling the store into # this module. @@ -342,8 +443,13 @@ def _source_label(self, source: SourceT) -> str: """A short id for log/probe messages, e.g. ``item #1640526``.""" raise NotImplementedError - def _start_checkout(self, page: "Page") -> bool: - """Reach the place-order review page from the product page.""" + def _start_checkout( + self, page: "Page", *, tracer: "_NullTracer | FlowTracer" = _NULL_TRACER + ) -> bool: + """Reach the place-order review page from the product page. + + ``tracer`` (default no-op) checkpoints the store-specific sub-steps — + add-to-cart, cart, payment — so ``trace-order`` captures each section.""" raise NotImplementedError def _reset_cart(self, page: "Page") -> None: @@ -475,6 +581,8 @@ def buy( item: CatalogItem, source: SourceT, proceed_check: ProceedCheck, + *, + tracer: "_NullTracer | FlowTracer" = _NULL_TRACER, ) -> PurchaseResult: """Execute (or dry-run) the buy of ``source`` for ``item``. @@ -482,6 +590,11 @@ def buy( programmer errors, not store flakiness — those become a ``failed`` result with a screenshot. A ``unavailable`` result (sold out / not carried / not found) signals the orchestrator to try the other store. + + ``tracer`` defaults to a shared no-op (:data:`_NULL_TRACER`): a normal + worker/orchestrator buy pays only one inert call per checkpoint. Pass a + :class:`FlowTracer` (the ``trace-order`` CLI does, forcing DRY_RUN) to + dump a DOM + selector probe + screenshot at every step of the way. """ api = _playwright_api() PWTimeout = api.TimeoutError # type: ignore[attr-defined] @@ -561,6 +674,7 @@ def buy( # Check before the price read: a 404 / sold-out page may carry no # price, and we want `unavailable` (fall back), not `failed`. 
self._settle(page) + tracer.checkpoint(page, "product_loaded") reason = self._check_availability(page, http_status) if reason is not None: shot = self._screenshot(page, item_key, "unavailable") @@ -597,11 +711,13 @@ def buy( screenshot=shot, ) + tracer.checkpoint(page, "price_read") + # ── reach the review page ── # From here we drive add-to-cart, so a later failure is no longer # money-safe to auto-retry. cart_touched = True - if not self._start_checkout(page): + if not self._start_checkout(page, tracer=tracer): # A failure here means we never confirmed the review page. An # Akamai block or a sign-in bounce mid-drive lands here too, so # classify those first (worker pauses) instead of mislabelling @@ -663,6 +779,9 @@ def buy( # bounded window the landing check uses before reading. self._wait_for_any(page, self.ORDER_TOTAL_SELECTORS, timeout=self._LANDING_TIMEOUT_MS) review_total = self._read_total(page) + # The review page is where the place-order/order-total/payment + # selector groups finally render — the surface dump-dom can't reach. + tracer.checkpoint(page, "checkout_landed") # ── hard cart-contents guard (⚠️ real money) ── # A live Place Order checks out the *whole* cart, and _reset_cart @@ -680,6 +799,7 @@ def buy( # ── DRY_RUN stops here ── if self.config.dry_run: + tracer.checkpoint(page, "review_pre_place") shot = self._screenshot(page, item_key, "review") msg = f"[DRY] would order {item_key} at ${price:.2f}" if review_total is not None: @@ -1689,7 +1809,9 @@ def ensure_logged_in(self, page: "Page") -> bool: self._settle(page) return self.is_logged_in(page) - def _start_checkout(self, page: "Page") -> bool: + def _start_checkout( + self, page: "Page", *, tracer: "_NullTracer | FlowTracer" = _NULL_TRACER + ) -> bool: """Add to cart → go to cart → checkout → select payment → review. 
Costco has no one-click Buy Now: the flow is add-to-cart, then the cart, @@ -1708,6 +1830,7 @@ def _start_checkout(self, page: "Page") -> bool: return False page.wait_for_load_state("domcontentloaded") self._settle(page) + tracer.checkpoint(page, "cart_added") # ── go to cart → checkout ── # Prefer the flyout's Checkout CTA; if that doesn't land us on the @@ -1722,6 +1845,7 @@ def _start_checkout(self, page: "Page") -> bool: wait_until="domcontentloaded", ) self._settle(page) + tracer.checkpoint(page, "cart_view") self._click_by_role(page, ("button", "link"), "checkout") self._settle(page) if not self._on_checkout(page): @@ -1731,11 +1855,13 @@ def _start_checkout(self, page: "Page") -> bool: # TODO(costco): verify against live DOM — does delivery need a click? self._click_by_role(page, ("button", "link"), "continue") self._settle(page) + tracer.checkpoint(page, "delivery_continue") # ── select the saved default payment method ── # Place Order is inert until a payment method is chosen, and the saved # card isn't reliably pre-selected, so click its radio here. self._select_payment_method(page) + tracer.checkpoint(page, "payment_selected") return True def _on_checkout(self, page: "Page") -> bool: @@ -2083,24 +2209,30 @@ def _login_init_script(self) -> Optional[str]: """ ) - def _start_checkout(self, page: "Page") -> bool: + def _start_checkout( + self, page: "Page", *, tracer: "_NullTracer | FlowTracer" = _NULL_TRACER + ) -> bool: """Click Buy Now; fall back to Add to Cart → Proceed to checkout. TODO(amazon): verify against live DOM — every step below. """ if self._click_first(page, self.BUY_NOW_SELECTORS): + tracer.checkpoint(page, "buy_now") return True if not self._click_first(page, self.ADD_TO_CART_SELECTORS): return False # Cart interstitial → checkout. 
page.wait_for_load_state("domcontentloaded") + tracer.checkpoint(page, "cart_added") for sel in ("#sc-buy-box-ptc-button", "input[name='proceedToRetailCheckout']"): if self._click_first(page, (sel,)): + tracer.checkpoint(page, "proceed_checkout") return True # Some flows expose a role-named link instead. try: page.get_by_role( "link", name=re.compile("proceed to checkout", re.I) ).first.click(timeout=5_000) + tracer.checkpoint(page, "proceed_checkout") return True except Exception: # noqa: BLE001 return False diff --git a/tests/fixtures/dom/README.md b/tests/fixtures/dom/README.md index 014992e..8aaad58 100644 --- a/tests/fixtures/dom/README.md +++ b/tests/fixtures/dom/README.md @@ -60,11 +60,13 @@ clean. Keep the resulting file ≤ ~300 KB. - **`costco_product_out_of_stock.html`** — no delivery-unavailable PDP has been captured (every dumped item was in stock). The `_check_availability` test skips until one is dropped in. -- **`costco_checkout_review.html`** — `dump-dom` stops at the product page and - never proceeds to checkout, so no SinglePageCheckoutView HTML exists to - sanitize (only `*_review.png` screenshots do). The `PLACE_ORDER_SELECTORS` / - `PAYMENT_METHOD_SELECTORS` / `ORDER_TOTAL_SELECTORS` test skips until a - logged-in, PII-scrubbed review-page capture is committed here. +- **`costco_checkout_review.html`** — `dump-dom` stops at the product page, but + `roomieorder trace-order ` (DRY_RUN) now drives the flow to the review + page and dumps a `*_checkout_landed_dom.html` of SinglePageCheckoutView. Drop a + logged-in, PII-scrubbed capture from that step here (sanitize per the rule + above — the checkout body renders the member name, address, and card last-4) to + un-skip the `PLACE_ORDER_SELECTORS` / `PAYMENT_METHOD_SELECTORS` / + `ORDER_TOTAL_SELECTORS` test. - **Confirmation page** — only reachable past a real Place Order; it remains the project's standing 🔵 caveat (AGENTS.md §1), out of scope for this harness. 
diff --git a/tests/test_cli.py b/tests/test_cli.py index d21038c..1bffe9e 100644 --- a/tests/test_cli.py +++ b/tests/test_cli.py @@ -200,6 +200,76 @@ def test_verify_selectors_no_source_for_provider(env: Path, monkeypatch: pytest. assert "no items declare a amazon source" in result.output +# ─────────── trace-order arg handling + dry-run contract (no browser) ─────────── + + +def test_trace_order_unknown_item(env: Path) -> None: + result = CliRunner().invoke(main, ["trace-order", "nope"]) + assert result.exit_code != 0 + assert "unknown item_key" in result.output + + +def test_trace_order_no_source_for_provider(env: Path) -> None: + result = CliRunner().invoke(main, ["trace-order", "dish_soap", "--provider", "amazon"]) + assert result.exit_code != 0 + assert "no amazon source" in result.output + + +def test_trace_order_forces_dry_run_and_prints_steps( + env: Path, monkeypatch: pytest.MonkeyPatch +) -> None: + from roomieorder.purchase import PurchaseResult, TraceStep + + # DRY_RUN off in the env must NOT reach the buy: trace-order hard-forces it. 
+ monkeypatch.setenv("DRY_RUN", "false") + seen: dict[str, object] = {} + + class _FakePurchaser: + config: object = None + + def _resolve_url(self, source: object) -> str: + return "https://example.test/p" + + def buy(self, item_key, item, source, proceed_check, *, tracer): # type: ignore[no-untyped-def] + seen["dry_run"] = self.config.dry_run # type: ignore[attr-defined] + seen["tracer"] = tracer + tracer.steps.append( + TraceStep( + name="product_loaded", + idx=1, + url="https://example.test/p", + summary="[price]\n sel count=1\n", + probe=Path("/tmp/p_probe.txt"), + ) + ) + tracer.steps.append( + TraceStep( + name="checkout_landed", + idx=2, + url="https://example.test/checkout", + summary="[place-order]\n sel count=1\n[order-total]\n sel count=1\n", + probe=Path("/tmp/c_probe.txt"), + ) + ) + return PurchaseResult(status="dry_run", unit_price=24.99, order_total=27.39) + + def _fake_purchaser_for(config: object, provider: str) -> object: + p = _FakePurchaser() + p.config = config + return p + + monkeypatch.setattr(cli, "_purchaser_for", _fake_purchaser_for) + result = CliRunner().invoke(main, ["trace-order", "paper_towels"]) + assert result.exit_code == 0, result.output + assert seen["dry_run"] is True # forced on despite DRY_RUN=false + assert "01 product_loaded" in result.output + assert "02 checkout_landed" in result.output + # The checkout step is where place-order/order-total finally resolve. + assert "place-order=ok" in result.output + assert "order-total=ok" in result.output + assert "status: dry_run" in result.output + + # ─────────── summary parsing helpers ─────────── _SAMPLE_SUMMARY = """\ diff --git a/tests/test_purchase.py b/tests/test_purchase.py index 3857efb..94b2b7a 100644 --- a/tests/test_purchase.py +++ b/tests/test_purchase.py @@ -4,17 +4,22 @@ # Test stubs are intentionally duck-typed fakes that implement only the Page # subset each test exercises; casting every call site would add noise. 
+from pathlib import Path + import pytest from roomieorder.catalog import load_catalog from roomieorder.config import Config from roomieorder.purchase import ( _JSONLD_SELECTOR, + _NULL_TRACER, AmazonPurchaser, CostcoPurchaser, + FlowTracer, _is_full_page_tag, _price_from_jsonld, looks_like, + new_run_id, parse_price, ) @@ -821,10 +826,114 @@ def test_cart_guard_base_hook_is_noop_for_amazon(config: Config) -> None: def test_amazon_start_checkout_clicks_buy_now(config: Config) -> None: page = _FakePage() page.present = {AmazonPurchaser.BUY_NOW_SELECTORS[0]} + # No tracer arg → the default no-op; the checkout flow is unchanged. assert _amazon(config)._start_checkout(page) is True assert page.clicked == [AmazonPurchaser.BUY_NOW_SELECTORS[0]] +# ─────────── flow tracer (full-flow dump for trace-order) ─────────── + + +class _RecordingTracer: + """Captures only the checkpoint names, in order — enough to assert that the + tracer is threaded through the buy-flow sub-steps.""" + + def __init__(self) -> None: + self.names: list[str] = [] + + def checkpoint(self, page: object, name: str) -> None: + self.names.append(name) + + +class _RecordingPurchaser: + """Duck-typed stand-in for the helpers FlowTracer.checkpoint calls, so the + tracer's orchestration (idx, file tags, best-effort) is testable without a + browser or a real purchaser.""" + + PROVIDER = "costco" + + def __init__(self, probe_raises: bool = False) -> None: + self.writes: list[tuple[str, str, str]] = [] + self.shots: list[str] = [] + self._probe_raises = probe_raises + + def is_logged_in(self, page: object) -> bool: + return True + + def _is_blocked(self, page: object, status: object = None) -> bool: + return False + + def _is_challenge(self, page: object) -> bool: + return False + + def _probe_selectors(self, page: object) -> str: + if self._probe_raises: + raise RuntimeError("boom") + return "[place-order]\n sel count=1\n" + + def _page_html(self, page: object) -> str: + return "" + + def _write_text(self, 
item_key: str, tag: str, ext: str, content: str): # type: ignore[no-untyped-def] + self.writes.append((item_key, tag, ext)) + return Path(f"/tmp/{tag}.{ext}") + + def _screenshot(self, page: object, item_key: str, tag: str): # type: ignore[no-untyped-def] + self.shots.append(tag) + return Path(f"/tmp/{tag}.png") + + +def test_null_tracer_checkpoint_is_noop() -> None: + # The default tracer on a live buy must do nothing and never raise. + _NULL_TRACER.checkpoint(_FakePage(), "product_loaded") + + +def test_new_run_id_is_trace_prefixed() -> None: + # The `trace` prefix groups a run's files and routes shots to full-page. + run_id = new_run_id() + assert run_id.startswith("trace") + assert _is_full_page_tag(f"{run_id}_01_product_loaded") is True + + +def test_flow_tracer_captures_each_checkpoint() -> None: + p = _RecordingPurchaser() + tracer = FlowTracer(p, "paper_towels", run_id="trace120000") + page = _FakePage(url="https://www.costco.com/checkout") + + tracer.checkpoint(page, "product_loaded") + tracer.checkpoint(page, "checkout_landed") + + assert [s.name for s in tracer.steps] == ["product_loaded", "checkout_landed"] + assert [s.idx for s in tracer.steps] == [1, 2] + # Tags carry the run_id, a zero-padded ordinal, and the checkpoint name. + assert ("paper_towels", "trace120000_01_product_loaded_dom", "html") in p.writes + assert ("paper_towels", "trace120000_02_checkout_landed_probe", "txt") in p.writes + assert "trace120000_02_checkout_landed" in p.shots + assert tracer.steps[0].summary.startswith("[place-order]") + assert tracer.steps[1].url == "https://www.costco.com/checkout" + + +def test_flow_tracer_survives_probe_failure() -> None: + # A probe error at one checkpoint must not abort the buy — it's recorded and + # the DOM/screenshot are still written. 
+ p = _RecordingPurchaser(probe_raises=True) + tracer = FlowTracer(p, "paper_towels", run_id="trace1") + tracer.checkpoint(_FakePage(), "checkout_landed") + assert len(tracer.steps) == 1 + assert "probe failed" in tracer.steps[0].summary + assert any(tag.endswith("_dom") for _, tag, _ in p.writes) + assert p.shots == ["trace1_01_checkout_landed"] + + +def test_amazon_start_checkout_threads_tracer(config: Config) -> None: + # The tracer reaches the store-specific sub-steps: Buy Now fires one checkpoint. + page = _FakePage() + page.present = {AmazonPurchaser.BUY_NOW_SELECTORS[0]} + rec = _RecordingTracer() + assert _amazon(config)._start_checkout(page, tracer=rec) is True + assert rec.names == ["buy_now"] + + # ─────────── order-id extraction ───────────
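The trace tag scheme tested above can be sketched in isolation. This is an illustrative reduction under stated assumptions: the body of `is_full_page_tag` is assumed (the diff shows only the constants and the function's signature), `FULL_PAGE_TAGS` is an abbreviated subset, and `step_tag` plus the injectable `now` parameter are hypothetical conveniences added so the example is deterministic.

```python
from datetime import datetime, timezone

# Abbreviated subset of the exact-match tags listed in the diff.
FULL_PAGE_TAGS = frozenset({"no_price", "crash"})
# Prefix families, including the new "trace" prefix.
FULL_PAGE_TAG_PREFIXES = ("blocked_", "challenge_", "signin_", "trace")


def new_run_id(now=None):
    """Return trace{HHMMSS} in UTC; `now` is a hypothetical test hook."""
    now = now or datetime.now(timezone.utc)
    return "trace" + now.strftime("%H%M%S")


def step_tag(run_id, idx, name):
    """Build the {run_id}_{NN}_{name} tag used for one checkpoint's files."""
    return f"{run_id}_{idx:02d}_{name}"


def is_full_page_tag(tag):
    """Assumed logic: exact match, or any of the known prefixes."""
    return tag in FULL_PAGE_TAGS or tag.startswith(FULL_PAGE_TAG_PREFIXES)


run_id = new_run_id(datetime(2024, 1, 2, 12, 0, 0, tzinfo=timezone.utc))
tag = step_tag(run_id, 2, "checkout_landed")
print(tag, is_full_page_tag(tag), is_full_page_tag("review"))
```

Because the id carries only the time of day, two runs started at the same wall-clock second would share a prefix unless the file helpers add their own timestamp, which this diff does not show.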