Skip to content

feat(kanban): board webhooks — registration + HMAC outbound delivery#44

Merged
cnjack merged 4 commits into
mainfrom
feat/kanban-d2-webhooks
Jun 23, 2026
Merged

feat(kanban): board webhooks — registration + HMAC outbound delivery#44
cnjack merged 4 commits into
mainfrom
feat/kanban-d2-webhooks

Conversation

@cnjack

@cnjack cnjack commented Jun 22, 2026

Copy link
Copy Markdown
Owner

What

Implements D2 — register webhooks on the DB board and deliver kanban events to external URLs, signed with HMAC-SHA256. (next-features-design.md §3)

Changes

  • migration 0017_kanban_webhookskanban_webhooks (url, secret, event filter, board scope) + kanban_webhook_deliveries (queue + retry ledger). Coexists with B1's 0016.
  • handlers/kanban/webhook.rslist / create / delete (owner-admin only). Create returns the plaintext secret once (then a mask). HTTPS-only + SSRF guard (rejects loopback/private hosts). 20-per-workspace cap. enqueue_event queues a delivery for every enabled, subscribed webhook (board-scoped or all-boards).
  • card.rs — enqueue on create / move / archive (the key lifecycle events).
  • tasks/webhook_delivery.rs — worker that picks due deliveries, signs the JSON body (X-JType-Signature: sha256=…), POSTs via reqwest, and records the outcome with exponential backoff (dead after max_attempts).
  • hmac crate; routes; module registration.
  • api.ts + KanbanWebhook(Created) types; Kanban.tsx Webhooks dialog (list / create with event + scope pickers / one-time secret reveal / delete).
  • testswebhook_crud_and_enqueue (integration, MySQL) + HMAC known-vector unit test + hex_encode.

Verification

  • cargo check + 39 kanban_tests pass against MySQL: create / secret-once / non-https reject / list / enqueue-on-card-create / delete + the 0017 migration.
  • HMAC signing matches a known RFC-style vector (hmac_sha256_matches_known_vector).
  • root + web tsc clean.

Notes / follow-ups

  • Migration 0017 (B1 PR takes 0016).
  • Follow-ups: patch / rotate-secret / test / deliveries-history endpoints + UI; enqueue at patch/restore/delete sites; DNS-rebind hardening; realtime delivery-status events.
  • The worker's live HTTP POST is covered by code review + the HMAC unit test; an end-to-end POST assertion needs a receiver service (out of scope here).

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features
    • Added Kanban webhook management (create, list, and delete) with event type filtering and optional board scoping.
    • Webhooks now trigger on card updates and card archive events.
    • Implemented signed outbound delivery with automatic retries, failure/dead tracking, and secret shown only at creation time.
  • Database
    • Added tables to store Kanban webhooks and delivery attempts.
  • Tests
    • Added an integration test covering webhook CRUD and delivery enqueueing.

Register webhooks on the DB board and deliver kanban events to external URLs,
signed with HMAC-SHA256. Implements D2 (next-features-design.md §3).

- migration 0017_kanban_webhooks: kanban_webhooks (url, secret, event filter,
  board scope) + kanban_webhook_deliveries (queue + retry ledger). Coexists with
  B1's 0016.
- handlers/kanban/webhook.rs: list / create / delete (owner-admin only). Create
  returns the plaintext secret ONCE (then a mask). HTTPS-only + SSRF guard
  (rejects loopback/private hosts). 20-per-workspace cap. `enqueue_event` queues
  a delivery for every enabled, subscribed webhook (board-scoped or all-boards).
- card.rs: enqueue on create / move / archive (the key lifecycle events).
- tasks/webhook_delivery.rs: worker that picks due deliveries, signs the JSON body
  (X-JType-Signature: sha256=…), POSTs via reqwest, and records the outcome with
  exponential backoff (dead after max_attempts). Spawned from run_from_env.
- hmac crate; routes; module registration.
- api.ts + KanbanWebhook(Created) types; Kanban.tsx Webhooks dialog (list /
  create with event + scope pickers / one-time secret reveal / delete).
- tests: webhook_crud_and_enqueue (integration, MySQL) + HMAC known-vector unit
  test + hex_encode.

Verified: cargo check + 39 kanban_tests pass against MySQL (create / secret-once /
non-https reject / list / enqueue-on-card-create / delete + the 0017 migration);
HMAC signing matches a known RFC-style vector; root + web tsc clean.

Notes: migration 0017 (B1 takes 0016). Follow-ups: patch / rotate-secret / test /
deliveries-history endpoints + UI; enqueue at patch/restore/delete sites; DNS-
rebind hardening; realtime delivery-status events. The worker's live HTTP POST is
covered by code + the HMAC unit test (an end-to-end POST needs a receiver).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@coderabbitai

coderabbitai Bot commented Jun 22, 2026

Copy link
Copy Markdown

Review Change Stack

Warning

Review limit reached

@cnjack, we couldn't start this review because you've reached your PR review rate limit.

More reviews will be available in 22 minutes and 18 seconds. Learn how PR review limits work.

Your organization has used up its prepaid credits, and credit purchases are no longer available. Enable the review add-on in the billing tab to keep reviews running — you're only billed for reviews past your plan's rate limits ($0.25/file).

⌛ How to resolve this issue?

After more reviews become available, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

To avoid repeated limits, reduce automatic review volume by pausing incremental auto-reviews earlier, using label-based review opt-in, excluding WIP or generated PR titles, or requesting reviews manually when the PR is ready. If your team needs uninterrupted high-volume reviews, an organization admin can enable usage-based credits.

🚦 How do rate limits work?

CodeRabbit enforces per-developer PR review limits for each organization. Most developers receive the normal plan refill rate.

For paid Pro and Pro+ PR reviews, CodeRabbit uses adaptive limits for sustained high-volume activity. When a developer's recent PR review activity reaches the 95th percentile or higher among CodeRabbit users, the refill rate gradually slows as usage increases. The highest same-day bursts are limited more strictly.

Please see our Fair Usage Limits Policy for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c64ca5a7-8950-4940-b534-adbca840130c

📥 Commits

Reviewing files that changed from the base of the PR and between 85b1a6c and e18419e.

📒 Files selected for processing (7)
  • services/jtype-web/frontend/src/api.ts
  • services/jtype-web/frontend/src/pages/Kanban.tsx
  • services/jtype-web/src/db/migrations.rs
  • services/jtype-web/src/handlers/kanban/card.rs
  • services/jtype-web/src/handlers/kanban/mod.rs
  • services/jtype-web/src/lib.rs
  • services/jtype-web/tests/kanban_tests.rs
📝 Walkthrough

Walkthrough

This PR implements a complete Kanban webhook system: a new DB migration adds kanban_webhooks and kanban_webhook_deliveries tables; a new Rust handler module provides CRUD endpoints with HTTPS/SSRF validation and a best-effort delivery enqueue helper; card operations (create, move, archive) call enqueue to trigger deliveries; a background worker polls due deliveries, signs them with HMAC-SHA256, POSTs them with retry logic and exponential backoff; and a frontend WebhooksDialog exposes webhook management to users.

Changes

Kanban Webhook System

Layer / File(s) Summary
DB schema: webhooks and deliveries tables
services/jtype-web/migrations/0017_kanban_webhooks.up.sql, services/jtype-web/migrations/0017_kanban_webhooks.down.sql, services/jtype-web/src/db/migrations.rs
Up migration creates kanban_webhooks and kanban_webhook_deliveries with indexes and cascade foreign-key constraints; down migration drops both. Version 17 registered in all_migrations().
Webhook handler: models, validation, CRUD, and enqueue
services/jtype-web/src/handlers/kanban/webhook.rs, services/jtype-web/src/handlers/kanban/mod.rs
Defines KanbanWebhook, KanbanWebhookCreated, and CreateKanbanWebhookRequest structs; implements row_to_webhook deserialization, validate_target_url (HTTPS-only + SSRF/loopback rejection), and list_webhooks, create_webhook (count cap, secret generation), delete_webhook handlers (owner/admin auth). Adds best-effort enqueue_event with exact and wildcard event-type matching.
Card event triggers and app wiring
services/jtype-web/src/handlers/kanban/card.rs, services/jtype-web/src/lib.rs, services/jtype-web/src/tasks/mod.rs
create_card, move_card, and archive_card each call enqueue_event post-publish. Webhook list/create and delete routes registered in the router. Startup spawns the webhook_delivery task instead of cleanup_trash.
Background delivery worker with HMAC-SHA256 signing
services/jtype-web/src/tasks/webhook_delivery.rs, services/jtype-web/Cargo.toml
spawn runs a single-flight Tokio interval loop. run_once batches due deliveries, serializes payloads, signs via HMAC-SHA256 with hex_encode helper, POSTs to target URLs with signature headers, and updates rows with success state or capped exponential-backoff retry scheduling (failed/dead, attempt count, error tracking, next_retry_at). Unit tests cover hex zero-padding and a known HMAC-SHA256 vector.
Frontend webhook API types and methods
services/jtype-web/frontend/src/api.ts
Adds KanbanWebhook and KanbanWebhookCreated TypeScript interfaces; adds listWebhooks, createWebhook, deleteWebhook on api.kanban.
Frontend webhooks UI dialog and state management
services/jtype-web/frontend/src/pages/Kanban.tsx
Adds showWebhooks state, BoltIcon header button to open the dialog, and conditional WebhooksDialog render. The dialog component maintains form state (name, URL, event types, board scope), fetches/displays/creates/deletes webhooks, reveals plaintext secret once on creation, and shows errors. Defines WEBHOOK_EVENTS constant.
Integration test: webhook CRUD and delivery enqueue
services/jtype-web/tests/kanban_tests.rs
webhook_crud_and_enqueue asserts: creation returns secret/secretMasked fields, non-HTTPS and SSRF-style URLs rejected (private IPv4, loopback, internal hostnames), list count, delivery row enqueued on card create via database query, and 204 + cascade delete on webhook deletion.

Sequence Diagram(s)

sequenceDiagram
  rect rgba(135, 206, 235, 0.5)
    Note over User,Frontend: Webhook Management (Frontend)
    User->>Frontend: Click Webhooks (BoltIcon)
    Frontend->>api.kanban.listWebhooks: GET /kanban/webhooks
    api.kanban.listWebhooks-->>Frontend: KanbanWebhook[]
    User->>Frontend: Submit create form (name, URL, events, scope)
    Frontend->>api.kanban.createWebhook: POST /kanban/webhooks
  end

  rect rgba(144, 238, 144, 0.5)
    Note over create_webhook,MySQL: Backend Create & Enqueue
    create_webhook->>create_webhook: validate_target_url (HTTPS, SSRF)
    create_webhook->>MySQL: INSERT kanban_webhooks
    create_webhook-->>Frontend: KanbanWebhookCreated (plaintext secret)
    Note over card.rs,MySQL: Card operation triggers enqueue
    card.rs->>enqueue_event: card created/moved/archived
    enqueue_event->>MySQL: INSERT kanban_webhook_deliveries
  end

  rect rgba(255, 200, 100, 0.5)
    Note over webhook_delivery,TargetURL: Background Delivery Worker
    webhook_delivery->>MySQL: SELECT due deliveries
    webhook_delivery->>webhook_delivery: HMAC-SHA256 sign payload
    webhook_delivery->>TargetURL: POST JSON with signature headers
    alt 2xx
      webhook_delivery->>MySQL: UPDATE status=succeeded
    else error
      webhook_delivery->>MySQL: UPDATE status=failed/dead, schedule backoff
    end
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Poem

🐇 A webhook hops from board to URL,
Signed with HMAC so secrets prevail,
Deliveries queue when a card takes a leap,
Retry on failure — no event shall sleep,
The Bolt icon gleams, the dialog appears,
A rabbit built webhooks — give three cheers! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 71.43% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately and concisely describes the main change: webhook registration and HMAC-based outbound delivery for kanban boards.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/kanban-d2-webhooks

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🧹 Nitpick comments (2)
services/jtype-web/migrations/0017_kanban_webhooks.up.sql (1)

39-39: 🚀 Performance & Scalability | 🔵 Trivial | ⚡ Quick win

Index shape misses the dequeue ordering column.

Line 39 indexes (status, next_retry_at), but the worker dequeues with ORDER BY created_at ASC LIMIT ...; this can force extra sorting as the queue grows. Include created_at in the same index for cheaper due-item polling.

Proposed migration tweak
-  KEY idx_deliveries_due (status, next_retry_at),
+  KEY idx_deliveries_due (status, next_retry_at, created_at),
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@services/jtype-web/migrations/0017_kanban_webhooks.up.sql` at line 39, The
index idx_deliveries_due on line 39 is defined as KEY idx_deliveries_due
(status, next_retry_at), but the worker dequeues records using ORDER BY
created_at ASC, which means the database cannot use the index for the ordering
operation and must perform additional sorting. Modify the index definition to
include created_at as the third column in the index to align with the dequeue
ordering, changing it to KEY idx_deliveries_due (status, next_retry_at,
created_at) so the index can efficiently support both the filter and the sort
operation during queue polling.
services/jtype-web/tests/kanban_tests.rs (1)

1631-1636: 🗄️ Data Integrity & Integration | 🔵 Trivial | ⚡ Quick win

Assert the cascade-delete side effect explicitly.

The test claims cascade removal of deliveries, but currently only checks 204. Add a post-delete DB assertion so FK cascade regressions are caught.

Proposed test addition
     let (status, _) = common::req(
         app, "DELETE", &format!("/api/v1/workspaces/{ws_id}/kanban/webhooks/{wh_id}"), Some(&token), None,
     ).await;
     assert_eq!(status, StatusCode::NO_CONTENT);
+
+    let remaining: i64 = sqlx::query_scalar(
+        "SELECT COUNT(*) FROM kanban_webhook_deliveries WHERE webhook_id = ?"
+    )
+    .bind(&wh_id)
+    .fetch_one(&pool)
+    .await
+    .unwrap();
+    assert_eq!(remaining, 0, "deliveries should be removed by ON DELETE CASCADE");
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@services/jtype-web/tests/kanban_tests.rs` around lines 1631 - 1636, The test
for webhook deletion only verifies the HTTP status code (204) but does not
assert that the cascade delete actually removed the associated deliveries from
the database. After the assert_eq call that checks the status code, add a
database query to verify that all deliveries associated with the deleted webhook
(identified by wh_id) have been removed. This ensures that FK cascade
constraints are working correctly and will catch any regressions in the cascade
delete behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@services/jtype-web/src/handlers/kanban/card.rs`:
- Around line 277-285: The webhook event enqueue via the enqueue_event function
occurs after the card mutation is committed, creating a non-atomic operation.
Wrap both the card update operation and the enqueue_event call within the same
database transaction so they commit together atomically. Apply this same
transactional pattern to all three locations where this issue occurs (the card
update, move, and archive paths) to ensure that if either the state change or
event enqueue fails, both are rolled back together.

In `@services/jtype-web/src/handlers/kanban/webhook.rs`:
- Around line 167-190: The webhook creation has a race condition where the COUNT
check on kanban_webhooks and the subsequent INSERT are not atomic. Two
concurrent requests can both pass the count validation and exceed the
MAX_WEBHOOKS_PER_WORKSPACE limit. Wrap both the sqlx::query_scalar count check
and the sqlx::query INSERT operation in a transaction using state.pool.begin()
to ensure atomicity, so that the count validation and insertion happen as a
single indivisible operation.
- Around line 99-113: The SSRF filter in the webhook handler uses string
manipulation to extract the host from the URL, which can be bypassed by URLs
with userinfo (like `https://user@127.0.0.1/...`) or IPv6 addresses that don't
match the prefix checks. Replace the manual string-splitting approach starting
with `let host = lower["https://".len()..]...split...next()` with proper URL
parsing (use a Rust URL parsing library) to extract the authority/host
component. Then replace the prefix-based blocked checks with proper host and IP
address classification that validates whether an address is actually internal,
handling both IPv4 (including proper CIDR ranges like 172.16.0.0/12) and IPv6
formats using dedicated IP parsing functions or libraries instead of string
prefix matching.

In `@services/jtype-web/src/tasks/webhook_delivery.rs`:
- Around line 54-67: The webhook delivery selection in the sqlx::query that
fetches rows from kanban_webhook_deliveries with status IN ('pending', 'failed')
lacks atomic ownership claiming, allowing multiple app instances to read and
process the same deliveries concurrently. Replace the current SELECT-only query
with an atomic operation that both selects pending/failed deliveries and claims
ownership (by updating them to a processing/claimed state) in a single database
transaction, preventing duplicate concurrent sends. Then ensure the subsequent
success/failure updates in the send logic only perform terminal-state updates
without re-incrementing attempt count.
- Around line 101-103: The UPDATE query for kanban_webhook_deliveries in the
success path does not reset the last_error field when marking a delivery as
succeeded, which leaves stale error information in the database that can mislead
operators and the UI. Modify the UPDATE query that sets status='succeeded',
attempt_count, last_status_code, and next_retry_at to also include clearing the
last_error field (set it to NULL or empty string) so that successful retries
have clean state with no lingering error messages.

---

Nitpick comments:
In `@services/jtype-web/migrations/0017_kanban_webhooks.up.sql`:
- Line 39: The index idx_deliveries_due on line 39 is defined as KEY
idx_deliveries_due (status, next_retry_at), but the worker dequeues records
using ORDER BY created_at ASC, which means the database cannot use the index for
the ordering operation and must perform additional sorting. Modify the index
definition to include created_at as the third column in the index to align with
the dequeue ordering, changing it to KEY idx_deliveries_due (status,
next_retry_at, created_at) so the index can efficiently support both the filter
and the sort operation during queue polling.

In `@services/jtype-web/tests/kanban_tests.rs`:
- Around line 1631-1636: The test for webhook deletion only verifies the HTTP
status code (204) but does not assert that the cascade delete actually removed
the associated deliveries from the database. After the assert_eq call that
checks the status code, add a database query to verify that all deliveries
associated with the deleted webhook (identified by wh_id) have been removed.
This ensures that FK cascade constraints are working correctly and will catch
any regressions in the cascade delete behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a066210d-2c94-4001-948c-720b8f961cad

📥 Commits

Reviewing files that changed from the base of the PR and between 5470b01 and c1ec124.

⛔ Files ignored due to path filters (1)
  • services/jtype-web/Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (13)
  • services/jtype-web/Cargo.toml
  • services/jtype-web/frontend/src/api.ts
  • services/jtype-web/frontend/src/pages/Kanban.tsx
  • services/jtype-web/migrations/0017_kanban_webhooks.down.sql
  • services/jtype-web/migrations/0017_kanban_webhooks.up.sql
  • services/jtype-web/src/db/migrations.rs
  • services/jtype-web/src/handlers/kanban/card.rs
  • services/jtype-web/src/handlers/kanban/mod.rs
  • services/jtype-web/src/handlers/kanban/webhook.rs
  • services/jtype-web/src/lib.rs
  • services/jtype-web/src/tasks/mod.rs
  • services/jtype-web/src/tasks/webhook_delivery.rs
  • services/jtype-web/tests/kanban_tests.rs

Comment on lines +277 to +285
super::webhook::enqueue_event(
&state.pool,
&workspace_id,
&board_id,
"kanban:card-updated",
json!({ "event": "kanban:card-updated", "cardId": card_id, "boardId": board_id }),
)
.await;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift

Webhook enqueue is non-atomic with the card mutation commit.

These paths commit the card change first, then enqueue webhook delivery rows. A crash/restart in that window persists the card mutation but drops the webhook event, breaking event consistency.

Suggested direction
-    tx.commit().await?;
-
-    super::webhook::enqueue_event(&state.pool, ...).await;
+    super::webhook::enqueue_event_tx(&mut tx, ...).await?;
+    tx.commit().await?;

Apply the same transactional outbox pattern to create/move/archive so state change and event enqueue succeed or fail together.

Also applies to: 733-741, 874-882

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@services/jtype-web/src/handlers/kanban/card.rs` around lines 277 - 285, The
webhook event enqueue via the enqueue_event function occurs after the card
mutation is committed, creating a non-atomic operation. Wrap both the card
update operation and the enqueue_event call within the same database transaction
so they commit together atomically. Apply this same transactional pattern to all
three locations where this issue occurs (the card update, move, and archive
paths) to ensure that if either the state change or event enqueue fails, both
are rolled back together.

Comment thread services/jtype-web/src/handlers/kanban/webhook.rs Outdated
Comment on lines +167 to +190
let count: i64 = sqlx::query_scalar("SELECT COUNT(*) FROM kanban_webhooks WHERE workspace_id = ?")
.bind(&workspace_id)
.fetch_one(&state.pool)
.await?;
if count >= MAX_WEBHOOKS_PER_WORKSPACE {
return Err(AppError::BadRequest("too many webhooks for this workspace".into()));
}

let id = Uuid::new_v4().to_string();
let secret = format!("{}{}", Uuid::new_v4().simple(), Uuid::new_v4().simple());
let events_json = serde_json::to_value(&events).unwrap_or(JsonValue::Array(vec![]));
sqlx::query(
"INSERT INTO kanban_webhooks (id, workspace_id, board_id, name, target_url, secret, event_types, created_by_user_id) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
)
.bind(&id)
.bind(&workspace_id)
.bind(&payload.board_id)
.bind(&name)
.bind(&target_url)
.bind(&secret)
.bind(&events_json)
.bind(&user.id)
.execute(&state.pool)
.await?;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟠 Major | 🏗️ Heavy lift

Webhook cap check is non-atomic and can be exceeded concurrently.

Lines 167-190 do COUNT(*) then INSERT outside a lock/transaction boundary. Two concurrent creates can both pass the count and push a workspace over the 20-webhook limit. Enforce this atomically (transactional lock or DB-level invariant).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@services/jtype-web/src/handlers/kanban/webhook.rs` around lines 167 - 190,
The webhook creation has a race condition where the COUNT check on
kanban_webhooks and the subsequent INSERT are not atomic. Two concurrent
requests can both pass the count validation and exceed the
MAX_WEBHOOKS_PER_WORKSPACE limit. Wrap both the sqlx::query_scalar count check
and the sqlx::query INSERT operation in a transaction using state.pool.begin()
to ensure atomicity, so that the count validation and insertion happen as a
single indivisible operation.

Comment on lines +54 to +67
let rows = sqlx::query(
r#"SELECT d.id, d.webhook_id, d.event_type, d.payload, d.attempt_count, d.max_attempts,
w.target_url, w.secret
FROM kanban_webhook_deliveries d
JOIN kanban_webhooks w ON w.id = d.webhook_id
WHERE d.status IN ('pending', 'failed')
AND (d.next_retry_at IS NULL OR d.next_retry_at <= NOW())
ORDER BY d.created_at ASC
LIMIT ?"#,
)
.bind(BATCH)
.fetch_all(pool)
.await?;

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🩺 Stability & Availability | 🟠 Major | 🏗️ Heavy lift

Due deliveries are not atomically claimed before sending.

Multiple app instances can read the same pending/failed rows and POST duplicates concurrently because selection and ownership claim are separate (and claim happens effectively after send). This can trigger duplicate downstream side effects and race attempt/status accounting.

Suggested direction
+        // Claim row ownership before POST (compare-and-set)
+        let claim = sqlx::query(
+            "UPDATE kanban_webhook_deliveries
+             SET attempt_count = attempt_count + 1
+             WHERE id = ?
+               AND attempt_count = ?
+               AND status IN ('pending','failed')
+               AND (next_retry_at IS NULL OR next_retry_at <= NOW())",
+        )
+        .bind(&id)
+        .bind(prev_attempts)
+        .execute(pool)
+        .await?;
+        if claim.rows_affected() == 0 {
+            continue; // another worker claimed it
+        }
+
-        let attempt = prev_attempts + 1;
+        let attempt = prev_attempts + 1;

Then keep success/failure updates as terminal-state updates only (without re-incrementing attempt count again).

Also applies to: 87-95, 97-132

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@services/jtype-web/src/tasks/webhook_delivery.rs` around lines 54 - 67, The
webhook delivery selection in the sqlx::query that fetches rows from
kanban_webhook_deliveries with status IN ('pending', 'failed') lacks atomic
ownership claiming, allowing multiple app instances to read and process the same
deliveries concurrently. Replace the current SELECT-only query with an atomic
operation that both selects pending/failed deliveries and claims ownership (by
updating them to a processing/claimed state) in a single database transaction,
preventing duplicate concurrent sends. Then ensure the subsequent
success/failure updates in the send logic only perform terminal-state updates
without re-incrementing attempt count.

Comment on lines +101 to +103
sqlx::query(
"UPDATE kanban_webhook_deliveries SET status='succeeded', attempt_count=?, last_status_code=?, next_retry_at=NULL WHERE id=?",
)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🗄️ Data Integrity & Integration | 🟡 Minor | ⚡ Quick win

Clear stale error details on successful retry.

A recovered delivery keeps old last_error text because success updates do not reset it, which leaves misleading state for operators/UI.

Proposed fix
- "UPDATE kanban_webhook_deliveries SET status='succeeded', attempt_count=?, last_status_code=?, next_retry_at=NULL WHERE id=?",
+ "UPDATE kanban_webhook_deliveries SET status='succeeded', attempt_count=?, last_status_code=?, last_error=NULL, next_retry_at=NULL WHERE id=?",
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
sqlx::query(
"UPDATE kanban_webhook_deliveries SET status='succeeded', attempt_count=?, last_status_code=?, next_retry_at=NULL WHERE id=?",
)
sqlx::query(
"UPDATE kanban_webhook_deliveries SET status='succeeded', attempt_count=?, last_status_code=?, last_error=NULL, next_retry_at=NULL WHERE id=?",
)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@services/jtype-web/src/tasks/webhook_delivery.rs` around lines 101 - 103, The
UPDATE query for kanban_webhook_deliveries in the success path does not reset
the last_error field when marking a delivery as succeeded, which leaves stale
error information in the database that can mislead operators and the UI. Modify
the UPDATE query that sets status='succeeded', attempt_count, last_status_code,
and next_retry_at to also include clearing the last_error field (set it to NULL
or empty string) so that successful retries have clean state with no lingering
error messages.

cnjack and others added 2 commits June 23, 2026 11:55
- validate_target_url now parses the URL with the `url` crate and classifies the
  host structurally, so `https://user@127.0.0.1/`, private/link-local IPv4,
  IPv6 loopback/ULA, and `*.internal`/`*.local` can no longer slip past the old
  prefix check. IP literals are classified via std::net (is_private/is_loopback/…).
- delivery worker: reqwest client uses redirect Policy::none() so a 3xx to an
  internal host can't bypass the create-time validation.
- test: webhook_crud_and_enqueue now asserts rejection of userinfo, private-IP,
  IPv6-loopback and .internal targets.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
# Conflicts:
#	services/jtype-web/frontend/src/api.ts
#	services/jtype-web/src/db/migrations.rs
#	services/jtype-web/src/lib.rs
#	services/jtype-web/tests/kanban_tests.rs
@cnjack cnjack merged commit b089b6d into main Jun 23, 2026
3 checks passed
@cnjack cnjack deleted the feat/kanban-d2-webhooks branch June 23, 2026 04:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant