feat: Lark/Feishu browser connector (QR login, real-user message capture)#426

Draft
seasidespace wants to merge 20 commits into berabuddies:master from seasidespace:feat/lark-browser-connector

@seasidespace

Contributor

Summary

Adds a lark-browser + feishu-browser connector: log into Lark/Feishu by scanning a QR code, then capture your incoming and outgoing chat messages into the monitor — all as a real user account (no bot, no API rate limits). Modeled on the existing gmail-browser connector.

Why

The existing lark-cli connector uses Lark's OpenAPI, which (a) has REST rate limits and (b) only acts as a bot, not as the real you. We wanted the WeChat model: log in as yourself with a QR code and read/send through the real client.

WeChat had to run its native client in a Docker container because WeChat has no web app. Lark/Feishu do have full web apps, and Puffer already ships an embedded browser (CEF) — so the right tool here is a browser-backed connector, not a container. Lighter, reuses proven infra, and avoids the whole Docker failure surface.

How it works (plain terms)

  1. Login: open the Lark/Feishu web app in Puffer's built-in browser; it shows a QR code, you scan it with your phone. Login is saved in the browser profile.
  2. Read (every ~30s, by looking at the web page's DOM):
    • the conversation list — every chat, last message, unread (broad coverage)
    • the open conversation via a tiny in-page watcher — exact per-message id + direction (incoming vs outgoing), so your own replies and phone-sent messages are caught too
  3. Emit: new messages are de-duplicated and handed to the monitor with is_outgoing + chat_id, so incoming messages can create tasks and your replies can auto-complete them.
  4. Act: send_message / read_history / react drive the same web page.
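
The read and emit steps above can be sketched as a small dedup pass with a first-poll baseline. This is a minimal illustration, not the connector's actual code: `FeedRow`, `MessageEvent`, `Seen`, and the chat-id-plus-preview key are hypothetical names and choices made for this example.

```rust
use std::collections::HashSet;

/// Hypothetical shape of one row scraped from the conversation list.
#[derive(Debug, Clone, PartialEq)]
pub struct FeedRow {
    pub chat_id: String,
    pub preview: String,
    pub is_outgoing: bool,
}

/// Hypothetical event handed to the monitor.
#[derive(Debug, Clone, PartialEq)]
pub struct MessageEvent {
    pub chat_id: String,
    pub text: String,
    pub is_outgoing: bool,
}

/// Remembers which (chat, preview) pairs have already been observed.
#[derive(Debug, Default)]
pub struct Seen {
    keys: HashSet<String>,
    initialized: bool,
}

impl Seen {
    /// Processes one poll of the conversation list.
    /// The first poll only records a baseline and emits nothing;
    /// later polls return one event per row that was not seen before.
    pub fn poll(&mut self, rows: &[FeedRow]) -> Vec<MessageEvent> {
        let mut events = Vec::new();
        for row in rows {
            // Assumption: chat id plus preview text is a good enough dedup key.
            let key = format!("{}\u{1f}{}", row.chat_id, row.preview);
            let is_new = self.keys.insert(key);
            if is_new && self.initialized {
                events.push(MessageEvent {
                    chat_id: row.chat_id.clone(),
                    text: row.preview.clone(),
                    is_outgoing: row.is_outgoing,
                });
            }
        }
        self.initialized = true;
        events
    }
}
```

A key built only from the chat id and preview text would miss a message that repeats an earlier preview verbatim, which is one reason the open conversation is read by exact per-message id instead.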

Two brands share one implementation, differing only by URL: lark-browser → web.larksuite.com, feishu-browser → web.feishu.cn.
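
One way to keep a single implementation for both brands is to route everything through a small enum. The sketch below is illustrative only: the `Brand` type and its method names are hypothetical, and the `https` scheme, the hosts, and the `/messenger/` path are assumptions read from this description rather than taken from the code.

```rust
/// Hypothetical brand selector; the connector ids come from this PR's description.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Brand {
    Lark,
    Feishu,
}

impl Brand {
    /// Maps a connector id to its brand, or None for anything else.
    pub fn from_connector_id(id: &str) -> Option<Brand> {
        match id {
            "lark-browser" => Some(Brand::Lark),
            "feishu-browser" => Some(Brand::Feishu),
            _ => None,
        }
    }

    /// Web host for the brand (assumed from the description above).
    pub fn web_host(self) -> &'static str {
        match self {
            Brand::Lark => "web.larksuite.com",
            Brand::Feishu => "web.feishu.cn",
        }
    }

    /// Entry URL that opens the messenger directly (scheme and path are assumptions).
    pub fn messenger_url(self) -> String {
        format!("https://{}/messenger/", self.web_host())
    }
}
```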

Why the DOM and not the API/network? A spike found Lark's realtime channel is protobuf and its local store is an encrypted WASM-SQLite blob — both as hard as cracking a native DB. The rendered page is the practical, stable read surface. (We depend on a few stable CSS/data-* hooks and deliberately avoid build-hashed classes.)

What happened (the journey)

  • Built in ~12 small, reviewed steps (each: implement → tests → independent review). Pure logic (parsing, dedup, direction, first-poll baseline, both-brand routing) is unit-tested (~50 tests); browser glue is verified manually.
  • A whole-branch review caught a real gap no per-step review owned: the connector wasn't registered in the catalog and didn't persist its config, so it could never actually run — fixed.
  • Live testing with the real desktop app + a real Lark account then surfaced three more bugs, each root-caused by inspecting the running browser:
    1. Login hung after the QR scan — the check waited for the messenger page, but Lark lands on Drive after login. Fix: detect login by being on the tenant host.
    2. Subscriber never polled — its config dropped a workspace_root field, so it looked for the browser engine in the wrong place. Fix: persist + use it.
    3. Feed kept flapping — it re-opened the web root each poll (→ Drive) and fought its own navigation. Fix: open the /messenger/ entry directly.
  • After those, capture is stable for both brands (verified live: messages captured and emitted).
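
The login fix (bug 1) amounts to checking the host rather than the page. A rough sketch of that idea follows, under stated assumptions: `host_of` and `is_logged_in` are hypothetical helpers, a tenant host is assumed to be a subdomain of the brand domain, and the list of non-tenant subdomains is a placeholder that would have to be filled in from the real login flow.

```rust
/// Extracts the host from an http(s) URL without external crates.
/// Returns None for other schemes or an empty host.
pub fn host_of(url: &str) -> Option<&str> {
    let rest = url
        .strip_prefix("https://")
        .or_else(|| url.strip_prefix("http://"))?;
    // The authority ends at the first path, query, or fragment delimiter.
    let end = rest
        .find(|c: char| matches!(c, '/' | '?' | '#'))
        .unwrap_or(rest.len());
    let authority = &rest[..end];
    // Drop optional userinfo and port.
    let authority = authority.rsplit('@').next().unwrap_or(authority);
    let host = authority.split(':').next().unwrap_or(authority);
    if host.is_empty() {
        None
    } else {
        Some(host)
    }
}

/// Treats the session as logged in when the browser sits on a tenant host,
/// whatever the path is (Drive, messenger, ...).
/// `non_tenant` lists subdomains that are NOT tenants (placeholder values).
pub fn is_logged_in(url: &str, brand_domain: &str, non_tenant: &[&str]) -> bool {
    let host = match host_of(url) {
        Some(h) => h,
        None => return false,
    };
    let suffix = format!(".{}", brand_domain);
    match host.strip_suffix(suffix.as_str()) {
        Some(sub) => !sub.is_empty() && !non_tenant.contains(&sub),
        None => false,
    }
}
```

With this shape, landing on Drive after the QR scan still counts as logged in, because only the host is inspected. The comparison is case-sensitive, so a real check would likely lowercase the host first.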

Status / what's left (why this is a draft)

  • ✅ Login, stable polling, message capture — both brands, verified live
  • ⏳ Hardening: the feed name/preview selectors and the editor/send/react selectors are best-effort; direction detection currently keys on the English "You:" preview prefix and should also recognize the localized "你:"
  • ⏳ End-to-end confirmation that an actionable incoming message creates a monitor task (the monitor triages: outgoing messages only auto-complete existing tasks, and incoming messages create a task only if actionable)
  • Carries one unrelated commit from the base branch (chore: ignore .worktrees).
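
The localized-prefix hardening could look roughly like the following. This is a sketch, not the connector's code: `split_direction` is a hypothetical helper, and the full-width colon variant is an extra assumption beyond the two prefixes named above.

```rust
/// Splits a feed preview into (is_outgoing, text).
/// A preview counts as outgoing when it starts with a known self prefix.
pub fn split_direction(preview: &str) -> (bool, &str) {
    // "You:" and "你:" come from the PR description;
    // "你：" (full-width colon) is an assumption added for illustration.
    const SELF_PREFIXES: [&str; 3] = ["You:", "你:", "你："];
    let trimmed = preview.trim_start();
    for prefix in SELF_PREFIXES.iter() {
        if let Some(rest) = trimmed.strip_prefix(*prefix) {
            return (true, rest.trim_start());
        }
    }
    (false, preview)
}
```

A prefix check like this would misclassify a group-chat sender whose display name is literally "You", so it is best treated as a fallback next to the per-message direction read from the open conversation.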

Test plan

  • /connect lark-browser and /connect feishu-browser — QR login both brands
  • Receive a substantive incoming message → creates a monitor task
  • Reply (incl. from phone) → auto-completes the matching task
  • send_message / read_history / react on both brands
  • First poll seeds without flooding the monitor

🤖 Generated with Claude Code

seasidespace and others added 20 commits June 17, 2026 19:57
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…emit

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ed in Task 7)

These 7 items (LarkBrowserConfig, SeenState, feed_fingerprint, feed_dedup_key,
should_emit_feed, build_message_event, now_ms) are wired into run_subscriber in
the next task (Task 7); the attribute is removed there once integrated.
…with optimistic-id reconciliation

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wire LARK_OBSERVER_INSTALL_JS/DRAIN_JS into each poll after the feed pass.
Add process_active_drain pure helper (seeds on first poll, emits snowflake
ids post-init, drops optimistic temp ids). Feed pass now accepts
active_chat_id and suppresses rows for the open chat to prevent
double-emit. Remove 5 #[allow(dead_code)] attrs. 18/18 lark_browser
tests pass; 0 lark dead-code warnings.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
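
The drain behaviour described in the commit above (seed on first poll, emit server ids afterwards, drop optimistic temp ids) can be illustrated with a small pure function. Everything here is hypothetical: `Drained`, `ActiveState`, and `drain_sketch` are names invented for the example, and the rule that a server-assigned id is a run of ASCII digits is an assumption, not something this PR states.

```rust
use std::collections::HashSet;

/// Hypothetical message record drained from the in-page watcher.
#[derive(Debug, Clone, PartialEq)]
pub struct Drained {
    pub id: String,
    pub text: String,
    pub is_outgoing: bool,
}

/// Per-conversation state kept between polls.
#[derive(Debug, Default)]
pub struct ActiveState {
    seen_ids: HashSet<String>,
    initialized: bool,
}

/// Assumption: a server-assigned id is a non-empty run of ASCII digits,
/// and anything else is an optimistic temporary id.
fn looks_server_assigned(id: &str) -> bool {
    !id.is_empty() && id.bytes().all(|b| b.is_ascii_digit())
}

/// Returns the messages to emit for one drained batch.
/// The first call only seeds the seen set; temporary ids are always dropped.
pub fn drain_sketch(state: &mut ActiveState, batch: Vec<Drained>) -> Vec<Drained> {
    let mut out = Vec::new();
    for msg in batch {
        if !looks_server_assigned(&msg.id) {
            continue;
        }
        let is_new = state.seen_ids.insert(msg.id.clone());
        if is_new && state.initialized {
            out.push(msg);
        }
    }
    state.initialized = true;
    out
}
```

Dropping temporary ids means an outgoing message is emitted once, when the page replaces the optimistic id with the server-assigned one.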
Adds lark_browser_actions.rs with handle_action dispatching on
send_message / read_history / react. Pure field-parsers are TDD-tested
(12 passing). Browser glue uses only stable hooks ([data-feed-id],
.js-message-item, .lark__editor, .send__button, [contenteditable]).
Wires SubscriberCommand::Custom { op == "lark_browser_act" } into
lark_browser.rs mirroring the gmail_browser_act pattern.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…an poll (both brands)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…g + persist config.toml at the runtime state dir

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ssenger/ (web root lands on Drive, not the feed)
…nects to the running browser daemon (not the manifest-dir cwd)
…Drive and re-opening it each poll fought the feed-nav)
@vercel

vercel Bot commented Jun 19, 2026


@seasidespace is attempting to deploy a commit to the fuzzland Team on Vercel.

A member of the Team first needs to authorize it.
