feat(aggregator): scraper daemon + dual-mode write-policy (#1046) #1069

Open
c03rad0r wants to merge 4 commits into PlebeianApp:feat/market-agg-relay from c03rad0r:feat/market-agg-scraper

Conversation

@c03rad0r c03rad0r commented Jun 26, 2026

Contributor

What

Adds an active scraper daemon to the aggregator relay from #1066, so it continuously mirrors marketplace events from upstream relays into one fast local relay — instead of only storing events explicitly published to it. The market app then reads from a single, pre-populated relay. Also adds a write-policy that accepts all marketplace event kinds.

Stacked on #1066 (the aggregator relay), which is still OPEN. Once #1066 merges, GitHub collapses this diff to just the scraper changes.

Why

The aggregator relay deployed in #1066 only holds events explicitly published to it. Without active scraping it would miss most marketplace activity — listings, profiles, ratings, etc. live spread across many public relays. The scraper pre-fetches and caches them so the app's reads resolve against one healthy local relay instead of fanning out across potentially-slow public relays.

Refs #1046. Depends on #1066.

Changes

| File | Change |
| --- | --- |
| `scraper.py` (new) | Long-lived daemon. Bootstraps from the configured seed npub's kind 3 + 10002, discovers each contact's relay list, opens a worker per relay holding chunked-author + #p subscriptions, and republishes every matching event into strfry. Tracks discovered pubkeys (capped via `MAX_PUBKEYS`) and prunes stale entries hourly. |
| `write-policy.py` | Accepts all public marketplace event kinds (0/1/3/7/9735/1985/10000/10002, 1023–1026, 30402/30405/06/08, 30440–30442, 31555/31990) so scraped market data is stored; rejects non-market kinds. Hot-reloads on config change. |
| `tests/` (new) | pytest suite, 108 tests. write-policy: market-kind acceptance + the full stdin/stdout strfry plugin protocol (case-insensitive seed, garbage rejection) exercised via subprocess. scraper: pubkey tracking/harvest with `MAX_PUBKEYS` cap, bounded-LRU event dedup, kind-3 follow + kind-10002 relay-list parsing, pruning, discovery, and the `relay_worker` mirror path (subscription construction + deduped republish) with a stubbed websocket. |
| `Dockerfile` | Add `py3-websocket-client` (the websocket-client lib `scraper.py` imports). |
| `docker-compose.yml` | Add scraper service (depends on `strfry-market-agg`, read-write state volume, internal `ws://strfry-market-agg:7777` URL). |
| `README.md` | Document scraper architecture, data flow, and configuration. |
| `strfry.conf` / `.env.example` | Scraper + relay configuration. |
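For readers unfamiliar with strfry write-policy plugins, the sketch below shows the general shape of the stdin/stdout protocol the `write-policy.py` row refers to: strfry writes one JSON request per line and expects one JSON response per line carrying the event `id` and an `action`. This is a minimal illustration, not the code in this PR. The names `handle_line` and `run` are hypothetical, the reject messages are made up, and the kind set expands the shorthand in the table on the assumption that `30405/06/08` means 30405, 30406, and 30408.

```python
import json
from typing import TextIO

# Kinds taken from the write-policy.py row; the expansion of the
# "30405/06/08" shorthand and of the ranges is an assumption.
PUBLIC_MARKET_KINDS = frozenset(
    {0, 1, 3, 7, 9735, 1985, 10000, 10002}
    | set(range(1023, 1027))      # 1023..1026
    | {30402, 30405, 30406, 30408}
    | set(range(30440, 30443))    # 30440..30442
    | {31555, 31990}
)


def handle_line(line: str) -> dict:
    """Turn one plugin request (a JSON line) into a response dict."""
    try:
        req = json.loads(line)
        event = req["event"]
        event_id, kind = event["id"], event["kind"]
    except (ValueError, KeyError, TypeError):
        # Garbage input: there is no event id to echo back.
        return {"id": "", "action": "reject", "msg": "invalid: malformed request"}
    if req.get("type") != "new" or not isinstance(kind, int):
        return {"id": event_id, "action": "reject", "msg": "invalid: unexpected request"}
    if kind in PUBLIC_MARKET_KINDS:
        return {"id": event_id, "action": "accept", "msg": ""}
    return {"id": event_id, "action": "reject", "msg": f"blocked: kind {kind} is not a market kind"}


def run(stdin: TextIO, stdout: TextIO) -> None:
    """Answer every non-empty request line with exactly one response line."""
    for line in stdin:
        if line.strip():
            stdout.write(json.dumps(handle_line(line)) + "\n")
            stdout.flush()  # strfry waits for each answer, so do not buffer
```

A script entry point would call `run(sys.stdin, sys.stdout)`. Keeping the decision in a pure function makes it testable without a subprocess, while the subprocess tests described in the table still cover the real I/O path.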

Architecture

[Market App] --reads--> [strfry :7777] <--write-policy
                              ^
                              | republish
                              |
                     [scraper daemon] --scrape--> damus / nos.lol / plebeian + discovered relays
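The scrape and republish arrows in the diagram can be sketched roughly as follows, assuming standard NIP-01 framing (`["REQ", <sub id>, <filter>]` toward upstream relays, `["EVENT", <sub id>, <event>]` coming back, and `["EVENT", <event>]` toward the local relay). The names `SeenEvents`, `build_subscriptions`, and `mirror`, the subscription id scheme, and the chunk size are all hypothetical; the actual `relay_worker` in `scraper.py` may be structured quite differently.

```python
import json
from collections import OrderedDict
from typing import Iterable, Iterator


class SeenEvents:
    """Bounded LRU set of event ids; the oldest id is evicted once full."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._ids = OrderedDict()

    def add(self, event_id: str) -> bool:
        """Record event_id and return True if it had not been seen before."""
        if event_id in self._ids:
            self._ids.move_to_end(event_id)  # refresh recency on a repeat
            return False
        self._ids[event_id] = None
        if len(self._ids) > self.capacity:
            self._ids.popitem(last=False)  # drop the least recently seen id
        return True


def build_subscriptions(pubkeys: list, kinds: list, chunk_size: int) -> list:
    """Return REQ frames: one authors filter and one #p filter per chunk."""
    frames = []
    for start in range(0, len(pubkeys), chunk_size):
        chunk = pubkeys[start:start + chunk_size]
        n = start // chunk_size
        frames.append(json.dumps(["REQ", f"authors-{n}", {"authors": chunk, "kinds": kinds}]))
        frames.append(json.dumps(["REQ", f"mentions-{n}", {"#p": chunk, "kinds": kinds}]))
    return frames


def mirror(upstream_frames: Iterable[str], seen: SeenEvents) -> Iterator[str]:
    """Yield an EVENT frame for the local relay for each first-seen event."""
    for raw in upstream_frames:
        try:
            frame = json.loads(raw)
        except ValueError:
            continue  # ignore frames that are not valid JSON
        # Relay-to-client events arrive as ["EVENT", <subscription id>, <event>].
        if isinstance(frame, list) and len(frame) == 3 and frame[0] == "EVENT":
            event = frame[2]
            if isinstance(event, dict) and isinstance(event.get("id"), str):
                if seen.add(event["id"]):
                    yield json.dumps(["EVENT", event])
```

Chunking the author list keeps each filter small, which matters because public relays commonly cap filter size. The bounded cache means the same event arriving from several upstream relays is forwarded once without the dedup state growing without limit.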

Test plan

  • pytest deploy-simple/aggregator/ -q → 108 passed: write-policy market-kind acceptance (via a real subprocess exercising the stdin/stdout plugin protocol) and scraper logic + relay_worker mirror path with a stubbed websocket module (no network).
  • Both .py files py_compile clean; docker-compose.yml validated.
  • Live (post-merge): scraper connects to seed relays, fetches the seed npub's network (kind 3 + 10002), republishes marketplace events to strfry, and stays current on the maintain timer.
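The "kind 3 + 10002" bootstrap in the test plan relies on two tag conventions: a kind-3 contact list carries one `["p", <hex pubkey>, ...]` tag per follow (NIP-02), and a kind-10002 relay list carries `["r", <url>, <optional "read" or "write">]` tags, where a missing marker means both (NIP-65). The sketch below parses those shapes. The function names and the validation rules (64 hex characters, `ws://` or `wss://` URLs) are assumptions for illustration, not the parsing code in `scraper.py`.

```python
import re

HEX_PUBKEY = re.compile(r"[0-9a-fA-F]{64}")


def parse_follows(event: dict) -> list:
    """Return the distinct followed pubkeys of a kind-3 event, lowercased."""
    if event.get("kind") != 3:
        return []
    follows = {}  # dict used as an insertion-ordered set
    for tag in event.get("tags", []):
        if len(tag) >= 2 and tag[0] == "p" and isinstance(tag[1], str):
            if HEX_PUBKEY.fullmatch(tag[1]):
                follows[tag[1].lower()] = None
    return list(follows)


def parse_relay_list(event: dict) -> dict:
    """Split a kind-10002 event into read and write relay URLs."""
    relays = {"read": [], "write": []}
    if event.get("kind") != 10002:
        return relays
    for tag in event.get("tags", []):
        if len(tag) < 2 or tag[0] != "r" or not isinstance(tag[1], str):
            continue
        url = tag[1]
        if not url.startswith(("ws://", "wss://")):
            continue  # skip anything that is not a websocket URL
        marker = tag[2] if len(tag) > 2 else None
        # No marker means the relay is used for both reading and writing.
        for mode in ("read", "write"):
            if marker in (None, mode) and url not in relays[mode]:
                relays[mode].append(url)
    return relays
```

For a mirroring scraper, the write relays of each contact are the natural places to look for that contact's own events, since those are the relays the contact declares publishing to.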

c03rad0r added a commit to c03rad0r/market that referenced this pull request Jun 26, 2026
Resolves prettier CI failures flagged in review of PRs PlebeianApp#1066/PlebeianApp#1069
(see t_f79d730e). Touches only files already in the env-config fix:

- src/lib/stores/ndk.ts: wrap over-width constants import; collapse
  primaryAgg ternary to single line (fits print width)
- deploy-simple/aggregator/docker-compose.yml: single-quote ports mapping
  (pre-existing double-quote violation)
c03rad0r added a commit to c03rad0r/market that referenced this pull request Jun 26, 2026
…App#1069 + RELAY_PLAN.md

Reconcile the write-policy conflict between PR PlebeianApp#1066 (market-kind gate)
and PR PlebeianApp#1069 (dual-mode gate + scraper):

- Adopt dual-mode gate from PlebeianApp#1069 as final design: PUBLIC market kinds
  from any pubkey, RESTRICTED kinds (gift wraps, orders, wallets) from
  root + WoT only
- Merge expanded kind set from PlebeianApp#1066 into the dual-mode structure:
  adds 4 (DMs), 5 (deletions), 1111 (comments), 30018 (legacy products),
  30000 (app settings), 25910 (ctxvm), 31989 (NIP-89 handler rec)
- Add RELAY_PLAN.md from PlebeianApp#1066 (updated for dual-mode)
- Root npub bootstrap retained for all kinds

This ensures the marketplace shows all sellers' public data while keeping
private order/payment data gated to trusted pubkeys.

Refs PlebeianApp#1046, PlebeianApp#1066, PlebeianApp#1069
c03rad0r added a commit to c03rad0r/market that referenced this pull request Jun 26, 2026
…dual-mode gate

Add 7 market-relevant kinds missing from the scraper branch's
PUBLIC_MARKET_KINDS, discovered in the PlebeianApp#1066 codebase audit:

- 4 (DMs/NIP-04), 5 (deletions), 1111 (comments)
- 30018 (NIP-15 legacy products), 30000 (app settings/vanity/NIP-05)
- 25910 (ctxvm messages), 31989 (NIP-89 handler recommendation)

Also add RELAY_PLAN.md documenting the two-tier relay topology and
dual-mode gate policy.

The dual-mode design from PlebeianApp#1069 is preserved: PUBLIC kinds from any
pubkey, RESTRICTED kinds (gift wraps, orders, wallets, app data) from
root + WoT only.

Refs PlebeianApp#1046, PlebeianApp#1066, PlebeianApp#1069
c03rad0r added 2 commits June 27, 2026 13:06
This test file was added in the prettier commit but targets an
applesauce-relay + RelayLiveness client architecture that the committed
src/lib/ctxcn-client.ts does not implement (it uses nostr-tools
SimplePool). The class API it asserts against — healthyRelays(),
allRelaysUnhealthy(), RelayLiveness.failover — does not exist in the
current client, so 21 of 23 tests fail (the 3 surfaced by CI at lines
297/310/323 plus 18 more), all with 'pool/liveness is undefined'.

It is a duplicate of contextvm-client.test.ts, which already covers the
real PlebianCurrencyClient against nostr-tools (6/6 pass). The relay
aggregator is unrelated to the currency client, so this file is simply
an accidental inclusion.

Removing it: full test:unit suite goes 103 pass / 0 fail.

Refs PlebeianApp#1046, PlebeianApp#1066
Reframe the MARKET_AGGREGATOR_RELAY doc comment as a read-only caching
relay for marketplace events rather than a 'WoT-gated' relay, matching
the product framing of this PR.

Refs PlebeianApp#1046, PlebeianApp#1066
c03rad0r added a commit to c03rad0r/market that referenced this pull request Jun 27, 2026
…anApp#1069)

108 tests covering:
- write-policy dual-mode gate: PUBLIC market kinds accepted from anyone,
  RESTRICTED kinds (1059/1060/30078/13/14/16/17/17375) gated to root npub +
  WoT allowlist, unknown kinds rejected; allowlist hot-reload on mtime change;
  full stdin/stdout strfry plugin protocol via subprocess.
- scraper daemon: pubkey tracking/harvest with MAX_PUBKEYS cap, bounded LRU
  event dedup, kind-3 follow + kind-10002 relay-list parsing, WoT export,
  stale pruning, relay-index discovery, and the relay_worker mirror path
  (chunked-author + #p subscriptions over SCRAPE_KINDS, deduped republish)
  with a stubbed websocket module.

pytest.ini adds pythonpath=. so the flat aggregator modules are importable;
write-policy.py (hyphenated) is loaded via importlib.
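Because `write-policy.py` contains a hyphen, it cannot be pulled in with a plain `import` statement. One common way to load such a file in tests is `importlib.util.spec_from_file_location`, sketched below. The helper name `load_module_from_path` and the module name passed to it are hypothetical; the test suite in this PR may do this differently.

```python
import importlib.util
import sys
from pathlib import Path
from types import ModuleType


def load_module_from_path(module_name: str, path: Path) -> ModuleType:
    """Load a Python source file under an arbitrary importable name."""
    spec = importlib.util.spec_from_file_location(module_name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load module from {path}")
    module = importlib.util.module_from_spec(spec)
    # Register before executing so the module can be found by name
    # (for example by dataclasses or pickling) while it initialises.
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module
```

A test would then call something like `load_module_from_path("write_policy", Path("write-policy.py"))` and use the returned module object directly.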
@c03rad0r c03rad0r force-pushed the feat/market-agg-scraper branch from cdead97 to 8a489c8 Compare June 27, 2026 07:59
@c03rad0r c03rad0r changed the base branch from master to feat/market-agg-relay June 27, 2026 08:00
…p#1046)

Active scraper daemon that mirrors marketplace events into the aggregator
relay from PlebeianApp#1066, plus a dual-mode write-policy: public market kinds
accepted from anyone, restricted kinds gated to the operator + allowlist.

Stacked on feat/market-agg-relay (PlebeianApp#1066).
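The dual-mode decision described in this commit message can be sketched as a small pure function. The restricted kind numbers are the ones listed in the test commit above; the public set here is only an illustrative subset, and the function name `decide`, the reject messages, and the lowercase-hex allowlist format are assumptions. The sketch also assumes that unknown kinds are rejected for every pubkey, including the operator's.

```python
# Illustrative subset only; the real public set is considerably larger.
PUBLIC_MARKET_KINDS = frozenset({0, 1, 3, 7, 30402})
# Restricted kinds as listed in the test commit message for this PR.
RESTRICTED_KINDS = frozenset({13, 14, 16, 17, 1059, 1060, 17375, 30078})


def decide(kind: int, pubkey: str, root_pubkey: str, allowlist: frozenset) -> tuple:
    """Return (action, message) for an event of `kind` signed by `pubkey`.

    `allowlist` is assumed to hold lowercase hex pubkeys.
    """
    if kind in PUBLIC_MARKET_KINDS:
        return ("accept", "")  # public market data is accepted from anyone
    if kind in RESTRICTED_KINDS:
        author = pubkey.lower()  # compare case-insensitively
        if author == root_pubkey.lower() or author in allowlist:
            return ("accept", "")
        return ("reject", "blocked: restricted kind from untrusted pubkey")
    return ("reject", f"blocked: kind {kind} is not allowed")
```

Checking the public set first keeps the common case (scraped listings and profiles) independent of the allowlist, so a stale or empty allowlist only affects the restricted kinds.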
@c03rad0r c03rad0r force-pushed the feat/market-agg-scraper branch from 8a489c8 to fde91ff Compare June 27, 2026 11:41
@c03rad0r

Contributor Author

ℹ️ Status note: This PR is stacked on #1066 (feat/market-agg-relay). CI does not run for stacked branches. Once #1066 merges, this will be retargeted to master and CI will trigger automatically. Reviewers can review the diff now — the scraper daemon and write-policy logic are self-contained.

c03rad0r added a commit to c03rad0r/market that referenced this pull request Jun 28, 2026
…App#1069 + RELAY_PLAN.md

Reconcile the write-policy conflict between PR PlebeianApp#1066 (market-kind gate)
and PR PlebeianApp#1069 (dual-mode gate + scraper):

- Adopt dual-mode gate from PlebeianApp#1069 as final design: PUBLIC market kinds
  from any pubkey, RESTRICTED kinds (gift wraps, orders, wallets) from
  root + WoT only
- Merge expanded kind set from PlebeianApp#1066 into the dual-mode structure:
  adds 4 (DMs), 5 (deletions), 1111 (comments), 30018 (legacy products),
  30000 (app settings), 25910 (ctxvm), 31989 (NIP-89 handler rec)
- Add RELAY_PLAN.md from PlebeianApp#1066 (updated for dual-mode)
- Root npub bootstrap retained for all kinds

This ensures the marketplace shows all sellers' public data while keeping
private order/payment data gated to trusted pubkeys.

Refs PlebeianApp#1046, PlebeianApp#1066, PlebeianApp#1069
@c03rad0r c03rad0r closed this Jun 28, 2026
@c03rad0r c03rad0r reopened this Jun 28, 2026
@c03rad0r c03rad0r added the theme:performance Relay/aggregator/query performance work label Jun 29, 2026
@c03rad0r

Contributor Author

📋 Stacking chain reference: Chain B (Relay/Aggregator): #1066 → this PR

Full review-order map with all stacking chains: #1088

(Automated tag — see issue for review priority recommendations.)

Labels

theme:performance Relay/aggregator/query performance work

2 participants