Replace Slack with Lark for approval notifications by henryqingmo · Pull Request #10 · K8Harness/ToolGate

henryqingmo · 2026-05-28T16:39:02Z

Summary

Replace Slack with Lark throughout: gateway notifier, webhook handler, mock service, eval suite, policy, and all tests
Wire real Lark approval flow end-to-end: Schema 2.0 callback support, HMAC verification for mock-lark, tenant token fetch with timeout
Fix LARK_API_BASE_URL docker-compose default — remove mock-lark hardcode so unset env routes to real Lark (open.feishu.cn); set LARK_API_BASE_URL=http://mock-lark:8090/open-apis in .env for local dev
Rename send_slack_message → send_lark_message in policy, eval suites, fake servers, and tests
Rename scenario approval-timeout-slack-down → approval-timeout-lark-down across scenarios, suites, and UI
Fix LangGraph checkpoint corruption after approval denial: inject synthetic ToolMessages for orphaned tool_calls before the next agent turn
Add README Lark setup guide (ngrok, Developer Console steps, .env config)

Test plan

go test ./... passes
docker compose up -d --wait starts all services healthy
create_refund in demo-webapp triggers an interactive Lark card in the configured chat
Clicking Approve/Deny in Lark resolves the approval hold and the agent completes
Denying and retrying no longer produces INVALID_CHAT_HISTORY error
Eval UI at http://localhost:8099 shows MCP Crash / Retry Storm / Approval Timeout scenarios
Approval Timeout scenario passes with mock-lark stopped

🤖 Generated with Claude Code

- Replace SlackClient/SlackNotifier with LarkClient/ApprovalNotifier using Lark's tenant_access_token + interactive card API - Replace SlackWebhookHandler with LarkWebhookHandler at POST /lark/actions with SHA-256 signature verification and URL challenge support - Add examples/mock-lark for local testing (replaces mock-slack) - Update config env vars: SLACK_* → LARK_APP_ID/APP_SECRET/CHAT_ID/VERIFICATION_TOKEN - Expose postgres:5432 and redis:6379 host ports in docker-compose - Update all tests and mocks Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ark binary Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Keep Lark notifier/webhook, bring in ApprovalLockTTL from main - NewRedisApprovalBridge now takes config.ApprovalLockTTL - Postgres port stays 5432:5432 (host mapping added by slackToLark) - Submodule pinned to 9fc10bc Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Replace SlackClient/SlackNotifier with LarkClient/ApprovalNotifier using Lark's tenant_access_token + interactive card API - Replace SlackWebhookHandler with LarkWebhookHandler at POST /lark/actions with SHA-256 signature verification and URL challenge support - Add examples/mock-lark for local testing (replaces mock-slack) - Update config env vars: SLACK_* → LARK_APP_ID/APP_SECRET/CHAT_ID/VERIFICATION_TOKEN - Expose postgres:5432 and redis:6379 host ports in docker-compose - Update all tests and mocks Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…ark binary Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Slack → Lark migration (gateway + eval runner): - Rename all Slack references to Lark throughout gateway config, webhook handler, policy gate, approval bridge, stack health, and tests - Add LARK_APP_ID / LARK_APP_SECRET / LARK_CHAT_ID / LARK_VERIFICATION_TOKEN env vars; remove SLACK_* vars from config and docker-compose files - Implement LarkNotifier in slack_webhook.go using Lark Open Platform APIs (tenant_access_token + im/v1/messages); keep HMAC signature verification for inbound callbacks on /lark/actions - Update mock-lark example server and mock-slack to match new Lark API shape - Update deploy/docker-compose.yml with mock-lark service and Lark env vars Fix approval-timeout eval scenario: - Add Lark reachability precondition check to approval-timeout scenario: fails fast with a clear message if mock-lark is still up (mirrors the existing MCP-crash precondition pattern); mock-lark auto-approves every ticket so the scenario can never reach 'expired' while it is running - Add APPROVAL_TIMEOUT=25s to root docker-compose.yml gateway service (was missing; deploy/docker-compose.yml already had it); without this the 5-minute default outlasts the eval runner's 90s poll window - Wire STACK_HEALTH_SLACK_URL → scenarioDeps.larkURL in serve.go so the precondition probe uses the same address as the stack health check - Update UI scenario description from "Slack is down" to "Lark is down" localstripe_demo submodule (d652019): - Bump to commit that fixes approval-flow race conditions in demo webapp: emit SSE done only after full event stream exhaustion, handle late-arriving tool_end badges, discourage dry_run=True in agent prompt and MCP docstring Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Resolved conflicts between local slackToLark (ApprovalTimeout field) and remote slackToLark (ApprovalLockTTL field): - Keep ApprovalTimeout / APPROVAL_TIMEOUT throughout (gateway config, main) - Keep full larkCallbackEnvelope v1/v2 handler in slack_webhook.go - Use hardcoded demo Lark credentials in root docker-compose.yml (with LARK_API_BASE_URL pointing to mock-lark) for local development - Keep /healthz endpoint in mock-lark Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- slack_webhook: support Lark Schema 2.0 card.action.trigger callbacks (token-in-body auth, nested event.operator/event.action structure) alongside existing HMAC-signed mock-lark path - slack_notifier: add 15s HTTP timeout and success log for card sends - slack_webhook: log decision on approval/deny for observability - docker-compose: parameterise Lark credentials via env vars with mock-lark fallbacks; fix port conflicts (postgres 15432, redis 16379); add eval-trigger service; remove mock-slack port collision - docker-compose.override: drop stale mock-slack gateway dependency - evalsuite/ai-agent: add missing policyOutcome to refund-intercepted case Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…dd README setup guide - Rename all send_slack_message → send_lark_message (policy, evals, fake servers, tests) - Rename approval-timeout-slack-down → approval-timeout-lark-down in scenarios/suites/UI - Rename STACK_HEALTH_SLACK_URL → STACK_HEALTH_LARK_URL env var - README: update services table, Scenario 3, add Real Lark approval setup section - docker-compose: remove mock-lark default for LARK_API_BASE_URL so unset env routes to real Lark (open.feishu.cn); set LARK_API_BASE_URL=http://mock-lark:8090/open-apis in .env to use mock-lark for local dev - Bump localstripe_demo submodule to include LangGraph checkpoint repair Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

gemini-code-assist

Code Review

This pull request migrates the approval notification system from Slack to Lark (Feishu), replacing the mock Slack services and tests with Lark equivalents, and introduces conditional rule evaluation based on tool arguments (when clauses) in the policy evaluator. Feedback on the changes highlights critical issues with Lark's API error handling, noting that business-level failures must be decoded and verified even when HTTP 200 is returned. Additionally, the signature verification logic in both the gateway and mock server needs to be corrected to match Lark's official concatenation order, and standard string comparisons should be replaced with subtle.ConstantTimeCompare to prevent timing attacks. Finally, the matchWhen helper should be updated to handle non-comparable types and numeric type mismatches robustly to avoid runtime panics.

gemini-code-assist · 2026-05-28T16:41:23Z

+// larkSendMessageReq is the payload for Lark's im/v1/messages endpoint.
+// Content is the JSON-encoded card string (Lark requires a JSON string, not object).
+type larkSendMessageReq struct {
+	ReceiveID string `json:"receive_id"`
+	MsgType   string `json:"msg_type"`
+	Content   string `json:"content"`
+}


Lark Open APIs return HTTP 200 even for business-level failures, indicating the actual status via a code field in the JSON response body. To handle these errors properly, we should define a response structure to decode and verify the response code.

// larkSendMessageReq is the payload for Lark's im/v1/messages endpoint. // Content is the JSON-encoded card string (Lark requires a JSON string, not object). type larkSendMessageReq struct { ReceiveID string `json:"receive_id"` MsgType string `json:"msg_type"` Content string `json:"content"` } // larkSendMessageResp is the response from Lark's im/v1/messages endpoint. type larkSendMessageResp struct { Code int `json:"code"` Msg string `json:"msg"` }

gemini-code-assist · 2026-05-28T16:41:23Z

 	if resp.StatusCode != http.StatusOK {
-		return fmt.Errorf("slack notifier: unexpected status %d", resp.StatusCode)
+		body, _ := io.ReadAll(resp.Body)
+		return fmt.Errorf("lark notifier: unexpected status %d: %s", resp.StatusCode, body)
 	}
-
+	c.log.Info("lark approval card sent", "ticketID", ticketID, "chatID", c.chatID)
 	return nil


Lark's message sending API returns HTTP 200 even when the message fails to deliver (e.g., due to invalid chat ID or missing permissions), returning a non-zero code in the JSON payload. We must decode the response body and verify that code == 0 to prevent silent failures.

rawBody, err := io.ReadAll(resp.Body) if err != nil { return fmt.Errorf("lark notifier: read message response: %w", err) } if resp.StatusCode != http.StatusOK { return fmt.Errorf("lark notifier: unexpected status %d: %s", resp.StatusCode, rawBody) } var sendResp larkSendMessageResp if err := json.Unmarshal(rawBody, &sendResp); err != nil { return fmt.Errorf("lark notifier: decode message response: %w", err) } if sendResp.Code != 0 { return fmt.Errorf("lark notifier: message error code %d: %s", sendResp.Code, sendResp.Msg) } c.log.Info("lark approval card sent", "ticketID", ticketID, "chatID", c.chatID) return nil

gemini-code-assist · 2026-05-28T16:41:23Z

+func matchWhen(when map[string]any, args map[string]any) bool {
+	for k, wantVal := range when {
+		gotVal, ok := args[k]
+		if !ok || gotVal != wantVal {
+			return false
+		}
+	}
+	return true
+}


The current implementation of matchWhen uses standard != comparison on any values. This has two major issues:

Comparing non-comparable types (like slices or maps, which are common in JSON arguments) will cause a runtime panic.

Numbers parsed from JSON are always float64, whereas numbers parsed from YAML are typically int or int64. This type mismatch causes numeric comparisons to silently fail (e.g., 100.0 != 100 evaluates to true).

Using reflect.DeepEqual and a numeric normalization helper solves both issues.

func matchWhen(when map[string]any, args map[string]any) bool { for k, wantVal := range when { gotVal, ok := args[k] if !ok { return false } if !compareValues(gotVal, wantVal) { return false } } return true } func compareValues(got, want any) bool { if reflect.DeepEqual(got, want) { return true } gotNum, gotOk := toFloat64(got) wantNum, wantOk := toFloat64(want) if gotOk && wantOk { return gotNum == wantNum } return false } func toFloat64(v any) (float64, bool) { switch val := v.(type) { case int: return float64(val), true case int8: return float64(val), true case int16: return float64(val), true case int32: return float64(val), true case int64: return float64(val), true case uint: return float64(val), true case uint8: return float64(val), true case uint16: return float64(val), true case uint32: return float64(val), true case uint64: return float64(val), true case float32: return float64(val), true case float64: return val, true } return 0, false }

gemini-code-assist · 2026-05-28T16:41:23Z

+// computeLarkSignature returns hex(sha256(verificationToken + timestamp + nonce + body)).
+// Both the gateway and mock-lark use this formula so signatures are mutually verifiable.
+func computeLarkSignature(verificationToken, timestamp, nonce string, body []byte) string {
+	h := sha256.New()
+	h.Write([]byte(verificationToken))
+	h.Write([]byte(timestamp))
+	h.Write([]byte(nonce))
+	h.Write(body)
+	return hex.EncodeToString(h.Sum(nil))
+}


The signature concatenation order in computeLarkSignature is incorrect. According to Lark's official specification, the signature must be computed as hex(sha256(timestamp + nonce + key + body)). The current implementation puts the key first, which will cause signature verification to fail for real Lark callbacks.

Suggested change

// computeLarkSignature returns hex(sha256(verificationToken + timestamp + nonce + body)).

// Both the gateway and mock-lark use this formula so signatures are mutually verifiable.

func computeLarkSignature(verificationToken, timestamp, nonce string, body []byte) string {

h := sha256.New()

h.Write([]byte(verificationToken))

h.Write([]byte(timestamp))

h.Write([]byte(nonce))

h.Write(body)

return hex.EncodeToString(h.Sum(nil))

}

// computeLarkSignature returns hex(sha256(timestamp + nonce + verificationToken + body)).

// This matches Lark's official signature verification specification.

func computeLarkSignature(verificationToken, timestamp, nonce string, body []byte) string {

h := sha256.New()

h.Write([]byte(timestamp))

h.Write([]byte(nonce))

h.Write([]byte(verificationToken))

h.Write(body)

return hex.EncodeToString(h.Sum(nil))

}

gemini-code-assist · 2026-05-28T16:41:23Z

+// computeSignature computes the Lark-compatible HMAC signature used by the gateway.
+// Formula: hex(sha256(verificationToken + timestamp + nonce + body))
+func computeSignature(token, timestamp, nonce string, body []byte) string {
+	h := sha256.New()
+	h.Write([]byte(token))
+	h.Write([]byte(timestamp))
+	h.Write([]byte(nonce))
+	h.Write(body)
+	return hex.EncodeToString(h.Sum(nil))
+}


Update the signature computation formula in the mock server to match Lark's official specification (timestamp + nonce + token + body) so that it remains in sync with the gateway's corrected signature verification.

Suggested change

// computeSignature computes the Lark-compatible HMAC signature used by the gateway.

// Formula: hex(sha256(verificationToken + timestamp + nonce + body))

func computeSignature(token, timestamp, nonce string, body []byte) string {

h := sha256.New()

h.Write([]byte(token))

h.Write([]byte(timestamp))

h.Write([]byte(nonce))

h.Write(body)

return hex.EncodeToString(h.Sum(nil))

}

// computeSignature computes the Lark-compatible signature used by the gateway.

// Formula: hex(sha256(timestamp + nonce + token + body))

func computeSignature(token, timestamp, nonce string, body []byte) string {

h := sha256.New()

h.Write([]byte(timestamp))

h.Write([]byte(nonce))

h.Write([]byte(token))

h.Write(body)

return hex.EncodeToString(h.Sum(nil))

}

gemini-code-assist · 2026-05-28T16:41:23Z

 import (
 	"context"
-	"crypto/hmac"
 	"crypto/sha256"
 	"encoding/hex"
 	"encoding/json"
 	"io"
 	"log/slog"
 	"net/http"
-	"net/url"
 	"strconv"
 	"time"


We should import "crypto/subtle" to perform constant-time signature comparisons, mitigating potential timing attacks on the webhook signature verification.

Suggested change

import (

"context"

"crypto/hmac"

"crypto/sha256"

"encoding/hex"

"encoding/json"

"io"

"log/slog"

"net/http"

"net/url"

"strconv"

"time"

import (

"context"

"crypto/sha256"

"crypto/subtle"

"encoding/hex"

"encoding/json"

"io"

"log/slog"

"net/http"

"strconv"

"time"

gemini-code-assist · 2026-05-28T16:41:24Z

+		providedSig := r.Header.Get("X-Lark-Signature")
+		if expectedSig != providedSig {
+			h.log.Warn("lark webhook: signature mismatch")
+			http.Error(w, "bad request", http.StatusBadRequest)
+			return
+		}


Comparing cryptographic signatures using standard string comparison (!=) is vulnerable to timing attacks. Use subtle.ConstantTimeCompare to ensure constant-time comparison.

Suggested change

providedSig := r.Header.Get("X-Lark-Signature")

if expectedSig != providedSig {

h.log.Warn("lark webhook: signature mismatch")

http.Error(w, "bad request", http.StatusBadRequest)

return

}

providedSig := r.Header.Get("X-Lark-Signature")

if subtle.ConstantTimeCompare([]byte(expectedSig), []byte(providedSig)) != 1 {

h.log.Warn("lark webhook: signature mismatch")

http.Error(w, "bad request", http.StatusBadRequest)

return

}

Tom-Shuhong-Tang and others added 10 commits May 28, 2026 14:55

chore: update localstripe_demo submodule to 9fc10bc and ignore mock-l…

e6cf0d3

…ark binary Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

chore: update localstripe_demo submodule to 9fc10bc and ignore mock-l…

b45a9a3

…ark binary Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

chore: update localstripe_demo to 3e1dfe0 (ignore gateway binary)

33737c3

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings May 28, 2026 16:39

Copilot started reviewing on behalf of henryqingmo May 28, 2026 16:39 View session

henryqingmo merged commit 30aa1b9 into main May 28, 2026
1 of 3 checks passed

gemini-code-assist Bot reviewed May 28, 2026

View reviewed changes

henryqingmo review requested due to automatic review settings May 28, 2026 17:03

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Replace Slack with Lark for approval notifications#10

Replace Slack with Lark for approval notifications#10
henryqingmo merged 10 commits into
mainfrom
slackToLark

henryqingmo commented May 28, 2026

Uh oh!

Uh oh!

gemini-code-assist Bot left a comment

Uh oh!

gemini-code-assist Bot May 28, 2026

Uh oh!

gemini-code-assist Bot May 28, 2026

Uh oh!

gemini-code-assist Bot May 28, 2026

Uh oh!

gemini-code-assist Bot May 28, 2026

Uh oh!

gemini-code-assist Bot May 28, 2026

Uh oh!

gemini-code-assist Bot May 28, 2026

Uh oh!

gemini-code-assist Bot May 28, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

henryqingmo commented May 28, 2026

Summary

Test plan

Uh oh!

Uh oh!

gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

Code Review

Uh oh!

gemini-code-assist Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

gemini-code-assist Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants