Skip to content

Replace Slack with Lark for approval notifications#10

Merged
henryqingmo merged 10 commits into
mainfrom
slackToLark
May 28, 2026
Merged

Replace Slack with Lark for approval notifications#10
henryqingmo merged 10 commits into
mainfrom
slackToLark

Conversation

@henryqingmo
Copy link
Copy Markdown
Contributor

Summary

  • Replace Slack with Lark throughout: gateway notifier, webhook handler, mock service, eval suite, policy, and all tests
  • Wire real Lark approval flow end-to-end: Schema 2.0 callback support, HMAC verification for mock-lark, tenant token fetch with timeout
  • Fix LARK_API_BASE_URL docker-compose default — remove mock-lark hardcode so unset env routes to real Lark (open.feishu.cn); set LARK_API_BASE_URL=http://mock-lark:8090/open-apis in .env for local dev
  • Rename send_slack_messagesend_lark_message in policy, eval suites, fake servers, and tests
  • Rename scenario approval-timeout-slack-downapproval-timeout-lark-down across scenarios, suites, and UI
  • Fix LangGraph checkpoint corruption after approval denial: inject synthetic ToolMessages for orphaned tool_calls before the next agent turn
  • Add README Lark setup guide (ngrok, Developer Console steps, .env config)

Test plan

  • go test ./... passes
  • docker compose up -d --wait starts all services healthy
  • create_refund in demo-webapp triggers an interactive Lark card in the configured chat
  • Clicking Approve/Deny in Lark resolves the approval hold and the agent completes
  • Denying and retrying no longer produces INVALID_CHAT_HISTORY error
  • Eval UI at http://localhost:8099 shows MCP Crash / Retry Storm / Approval Timeout scenarios
  • Approval Timeout scenario passes with mock-lark stopped

🤖 Generated with Claude Code

Tom-Shuhong-Tang and others added 10 commits May 28, 2026 14:55
- Replace SlackClient/SlackNotifier with LarkClient/ApprovalNotifier
  using Lark's tenant_access_token + interactive card API
- Replace SlackWebhookHandler with LarkWebhookHandler at POST /lark/actions
  with SHA-256 signature verification and URL challenge support
- Add examples/mock-lark for local testing (replaces mock-slack)
- Update config env vars: SLACK_* → LARK_APP_ID/APP_SECRET/CHAT_ID/VERIFICATION_TOKEN
- Expose postgres:5432 and redis:6379 host ports in docker-compose
- Update all tests and mocks

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ark binary

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Keep Lark notifier/webhook, bring in ApprovalLockTTL from main
- NewRedisApprovalBridge now takes config.ApprovalLockTTL
- Postgres port stays 5432:5432 (host mapping added by slackToLark)
- Submodule pinned to 9fc10bc

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace SlackClient/SlackNotifier with LarkClient/ApprovalNotifier
  using Lark's tenant_access_token + interactive card API
- Replace SlackWebhookHandler with LarkWebhookHandler at POST /lark/actions
  with SHA-256 signature verification and URL challenge support
- Add examples/mock-lark for local testing (replaces mock-slack)
- Update config env vars: SLACK_* → LARK_APP_ID/APP_SECRET/CHAT_ID/VERIFICATION_TOKEN
- Expose postgres:5432 and redis:6379 host ports in docker-compose
- Update all tests and mocks

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…ark binary

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Slack → Lark migration (gateway + eval runner):
- Rename all Slack references to Lark throughout gateway config, webhook
  handler, policy gate, approval bridge, stack health, and tests
- Add LARK_APP_ID / LARK_APP_SECRET / LARK_CHAT_ID / LARK_VERIFICATION_TOKEN
  env vars; remove SLACK_* vars from config and docker-compose files
- Implement LarkNotifier in slack_webhook.go using Lark Open Platform APIs
  (tenant_access_token + im/v1/messages); keep HMAC signature verification
  for inbound callbacks on /lark/actions
- Update mock-lark example server and mock-slack to match new Lark API shape
- Update deploy/docker-compose.yml with mock-lark service and Lark env vars

Fix approval-timeout eval scenario:
- Add Lark reachability precondition check to approval-timeout scenario:
  fails fast with a clear message if mock-lark is still up (mirrors the
  existing MCP-crash precondition pattern); mock-lark auto-approves every
  ticket so the scenario can never reach 'expired' while it is running
- Add APPROVAL_TIMEOUT=25s to root docker-compose.yml gateway service
  (was missing; deploy/docker-compose.yml already had it); without this the
  5-minute default outlasts the eval runner's 90s poll window
- Wire STACK_HEALTH_SLACK_URL → scenarioDeps.larkURL in serve.go so the
  precondition probe uses the same address as the stack health check
- Update UI scenario description from "Slack is down" to "Lark is down"

localstripe_demo submodule (d652019):
- Bump to commit that fixes approval-flow race conditions in demo webapp:
  emit SSE done only after full event stream exhaustion, handle late-arriving
  tool_end badges, discourage dry_run=True in agent prompt and MCP docstring

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Resolved conflicts between local slackToLark (ApprovalTimeout field) and
remote slackToLark (ApprovalLockTTL field):
- Keep ApprovalTimeout / APPROVAL_TIMEOUT throughout (gateway config, main)
- Keep full larkCallbackEnvelope v1/v2 handler in slack_webhook.go
- Use hardcoded demo Lark credentials in root docker-compose.yml (with
  LARK_API_BASE_URL pointing to mock-lark) for local development
- Keep /healthz endpoint in mock-lark

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- slack_webhook: support Lark Schema 2.0 card.action.trigger callbacks
  (token-in-body auth, nested event.operator/event.action structure)
  alongside existing HMAC-signed mock-lark path
- slack_notifier: add 15s HTTP timeout and success log for card sends
- slack_webhook: log decision on approval/deny for observability
- docker-compose: parameterise Lark credentials via env vars with
  mock-lark fallbacks; fix port conflicts (postgres 15432, redis 16379);
  add eval-trigger service; remove mock-slack port collision
- docker-compose.override: drop stale mock-slack gateway dependency
- evalsuite/ai-agent: add missing policyOutcome to refund-intercepted case

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…dd README setup guide

- Rename all send_slack_message → send_lark_message (policy, evals, fake servers, tests)
- Rename approval-timeout-slack-down → approval-timeout-lark-down in scenarios/suites/UI
- Rename STACK_HEALTH_SLACK_URL → STACK_HEALTH_LARK_URL env var
- README: update services table, Scenario 3, add Real Lark approval setup section
- docker-compose: remove mock-lark default for LARK_API_BASE_URL so unset env routes
  to real Lark (open.feishu.cn); set LARK_API_BASE_URL=http://mock-lark:8090/open-apis
  in .env to use mock-lark for local dev
- Bump localstripe_demo submodule to include LangGraph checkpoint repair

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 28, 2026 16:39
@henryqingmo henryqingmo merged commit 30aa1b9 into main May 28, 2026
1 of 3 checks passed
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request migrates the approval notification system from Slack to Lark (Feishu), replacing the mock Slack services and tests with Lark equivalents, and introduces conditional rule evaluation based on tool arguments (when clauses) in the policy evaluator. Feedback on the changes highlights critical issues with Lark's API error handling, noting that business-level failures must be decoded and verified even when HTTP 200 is returned. Additionally, the signature verification logic in both the gateway and mock server needs to be corrected to match Lark's official concatenation order, and standard string comparisons should be replaced with subtle.ConstantTimeCompare to prevent timing attacks. Finally, the matchWhen helper should be updated to handle non-comparable types and numeric type mismatches robustly to avoid runtime panics.

Comment on lines +110 to +116
// larkSendMessageReq is the payload for Lark's im/v1/messages endpoint.
// Content is the JSON-encoded card string (Lark requires a JSON string, not object).
type larkSendMessageReq struct {
ReceiveID string `json:"receive_id"`
MsgType string `json:"msg_type"`
Content string `json:"content"`
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Lark Open APIs return HTTP 200 even for business-level failures, indicating the actual status via a code field in the JSON response body. To handle these errors properly, we should define a response structure to decode and verify the response code.

// larkSendMessageReq is the payload for Lark's im/v1/messages endpoint.
// Content is the JSON-encoded card string (Lark requires a JSON string, not object).
type larkSendMessageReq struct {
	ReceiveID string `json:"receive_id"`
	MsgType   string `json:"msg_type"`
	Content   string `json:"content"`
}

// larkSendMessageResp is the response from Lark's im/v1/messages endpoint.
type larkSendMessageResp struct {
	Code int    `json:"code"`
	Msg  string `json:"msg"`
}

Comment on lines 227 to 232
if resp.StatusCode != http.StatusOK {
return fmt.Errorf("slack notifier: unexpected status %d", resp.StatusCode)
body, _ := io.ReadAll(resp.Body)
return fmt.Errorf("lark notifier: unexpected status %d: %s", resp.StatusCode, body)
}

c.log.Info("lark approval card sent", "ticketID", ticketID, "chatID", c.chatID)
return nil
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

Lark's message sending API returns HTTP 200 even when the message fails to deliver (e.g., due to invalid chat ID or missing permissions), returning a non-zero code in the JSON payload. We must decode the response body and verify that code == 0 to prevent silent failures.

	rawBody, err := io.ReadAll(resp.Body)
	if err != nil {
		return fmt.Errorf("lark notifier: read message response: %w", err)
	}
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("lark notifier: unexpected status %d: %s", resp.StatusCode, rawBody)
	}

	var sendResp larkSendMessageResp
	if err := json.Unmarshal(rawBody, &sendResp); err != nil {
		return fmt.Errorf("lark notifier: decode message response: %w", err)
	}
	if sendResp.Code != 0 {
		return fmt.Errorf("lark notifier: message error code %d: %s", sendResp.Code, sendResp.Msg)
	}

	c.log.Info("lark approval card sent", "ticketID", ticketID, "chatID", c.chatID)
	return nil

Comment thread core/policy/evaluator.go
Comment on lines +46 to +54
func matchWhen(when map[string]any, args map[string]any) bool {
for k, wantVal := range when {
gotVal, ok := args[k]
if !ok || gotVal != wantVal {
return false
}
}
return true
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The current implementation of matchWhen uses standard != comparison on any values. This has two major issues:

  1. Comparing non-comparable types (like slices or maps, which are common in JSON arguments) will cause a runtime panic.
  2. Numbers parsed from JSON are always float64, whereas numbers parsed from YAML are typically int or int64. This type mismatch causes numeric comparisons to silently fail (e.g., 100.0 != 100 evaluates to true).

Using reflect.DeepEqual and a numeric normalization helper solves both issues.

func matchWhen(when map[string]any, args map[string]any) bool {
	for k, wantVal := range when {
		gotVal, ok := args[k]
		if !ok {
			return false
		}
		if !compareValues(gotVal, wantVal) {
			return false
		}
	}
	return true
}

func compareValues(got, want any) bool {
	if reflect.DeepEqual(got, want) {
		return true
	}
	gotNum, gotOk := toFloat64(got)
	wantNum, wantOk := toFloat64(want)
	if gotOk && wantOk {
		return gotNum == wantNum
	}
	return false
}

func toFloat64(v any) (float64, bool) {
	switch val := v.(type) {
	case int:
		return float64(val), true
	case int8:
		return float64(val), true
	case int16:
		return float64(val), true
	case int32:
		return float64(val), true
	case int64:
		return float64(val), true
	case uint:
		return float64(val), true
	case uint8:
		return float64(val), true
	case uint16:
		return float64(val), true
	case uint32:
		return float64(val), true
	case uint64:
		return float64(val), true
	case float32:
		return float64(val), true
	case float64:
		return val, true
	}
	return 0, false
}

Comment on lines +124 to +133
// computeLarkSignature returns hex(sha256(verificationToken + timestamp + nonce + body)).
// Both the gateway and mock-lark use this formula so signatures are mutually verifiable.
func computeLarkSignature(verificationToken, timestamp, nonce string, body []byte) string {
h := sha256.New()
h.Write([]byte(verificationToken))
h.Write([]byte(timestamp))
h.Write([]byte(nonce))
h.Write(body)
return hex.EncodeToString(h.Sum(nil))
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

The signature concatenation order in computeLarkSignature is incorrect. According to Lark's official specification, the signature must be computed as hex(sha256(timestamp + nonce + key + body)). The current implementation puts the key first, which will cause signature verification to fail for real Lark callbacks.

Suggested change
// computeLarkSignature returns hex(sha256(verificationToken + timestamp + nonce + body)).
// Both the gateway and mock-lark use this formula so signatures are mutually verifiable.
func computeLarkSignature(verificationToken, timestamp, nonce string, body []byte) string {
h := sha256.New()
h.Write([]byte(verificationToken))
h.Write([]byte(timestamp))
h.Write([]byte(nonce))
h.Write(body)
return hex.EncodeToString(h.Sum(nil))
}
// computeLarkSignature returns hex(sha256(timestamp + nonce + verificationToken + body)).
// This matches Lark's official signature verification specification.
func computeLarkSignature(verificationToken, timestamp, nonce string, body []byte) string {
h := sha256.New()
h.Write([]byte(timestamp))
h.Write([]byte(nonce))
h.Write([]byte(verificationToken))
h.Write(body)
return hex.EncodeToString(h.Sum(nil))
}

Comment on lines +102 to +111
// computeSignature computes the Lark-compatible HMAC signature used by the gateway.
// Formula: hex(sha256(verificationToken + timestamp + nonce + body))
func computeSignature(token, timestamp, nonce string, body []byte) string {
h := sha256.New()
h.Write([]byte(token))
h.Write([]byte(timestamp))
h.Write([]byte(nonce))
h.Write(body)
return hex.EncodeToString(h.Sum(nil))
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-high high

Update the signature computation formula in the mock server to match Lark's official specification (timestamp + nonce + token + body) so that it remains in sync with the gateway's corrected signature verification.

Suggested change
// computeSignature computes the Lark-compatible HMAC signature used by the gateway.
// Formula: hex(sha256(verificationToken + timestamp + nonce + body))
func computeSignature(token, timestamp, nonce string, body []byte) string {
h := sha256.New()
h.Write([]byte(token))
h.Write([]byte(timestamp))
h.Write([]byte(nonce))
h.Write(body)
return hex.EncodeToString(h.Sum(nil))
}
// computeSignature computes the Lark-compatible signature used by the gateway.
// Formula: hex(sha256(timestamp + nonce + token + body))
func computeSignature(token, timestamp, nonce string, body []byte) string {
h := sha256.New()
h.Write([]byte(timestamp))
h.Write([]byte(nonce))
h.Write([]byte(token))
h.Write(body)
return hex.EncodeToString(h.Sum(nil))
}

Comment on lines 3 to 12
import (
"context"
"crypto/hmac"
"crypto/sha256"
"encoding/hex"
"encoding/json"
"io"
"log/slog"
"net/http"
"net/url"
"strconv"
"time"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

We should import "crypto/subtle" to perform constant-time signature comparisons, mitigating potential timing attacks on the webhook signature verification.

Suggested change
import (
"context"
"crypto/hmac"
"crypto/sha256"
"encoding/hex"
"encoding/json"
"io"
"log/slog"
"net/http"
"net/url"
"strconv"
"time"
import (
"context"
"crypto/sha256"
"crypto/subtle"
"encoding/hex"
"encoding/json"
"io"
"log/slog"
"net/http"
"strconv"
"time"

Comment on lines +227 to +232
providedSig := r.Header.Get("X-Lark-Signature")
if expectedSig != providedSig {
h.log.Warn("lark webhook: signature mismatch")
http.Error(w, "bad request", http.StatusBadRequest)
return
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

security-medium medium

Comparing cryptographic signatures using standard string comparison (!=) is vulnerable to timing attacks. Use subtle.ConstantTimeCompare to ensure constant-time comparison.

Suggested change
providedSig := r.Header.Get("X-Lark-Signature")
if expectedSig != providedSig {
h.log.Warn("lark webhook: signature mismatch")
http.Error(w, "bad request", http.StatusBadRequest)
return
}
providedSig := r.Header.Get("X-Lark-Signature")
if subtle.ConstantTimeCompare([]byte(expectedSig), []byte(providedSig)) != 1 {
h.log.Warn("lark webhook: signature mismatch")
http.Error(w, "bad request", http.StatusBadRequest)
return
}

@henryqingmo henryqingmo review requested due to automatic review settings May 28, 2026 17:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants