feat(expert): human-in-the-loop tool permissions (#421) by andypalmi · Pull Request #7639 · FlowFuse/flowfuse

andypalmi · 2026-06-30T13:44:49Z

Human-in-the-loop tool permissions for the Expert

Implements per-tool human-in-the-loop permissions for the Expert's flow-building tools, in the immersive editor, as described in FlowFuse/product#421. The builder (and their team role) controls which flow-building actions the Expert may run, which need approval, and which are off limits, so it never makes a change they would not have allowed.

Stacked on #7635 (feat/408-expert-plan-mode), which is the base of this PR and should merge first.

What it does

Inline approval card in chat when a tool's policy is Ask: friendly tool name, action type (Read / Write / Delete) and the concrete call parameters as prettified JSON, with Allow / Always allow / Deny / Always deny. "Always allow" and "Always deny" apply for the rest of the current chat and reset on Start Over and on refresh; after a choice the card shows exactly what was picked and collapses the payload. The agent pauses on the round-trip with no session timeout, however long the user takes; the chat stop button cancels it (treated as denied).
Per-team settings (in the Expert settings dialog): permissions are saved per team. Each tool group (flow-building, platform) has its own always-visible default permissions — a per-action-type default (Always allow / Ask / Always deny) for read, write and delete — and collapses its individual-tool overrides behind an accordion. The policy control is a fast three-button toggle rather than a dropdown.
Per-tool overrides that stay put: a permission set on an individual tool overrides its type default and keeps that setting until reset. Each type default shows an "N set individually" count of the tools in that scope carrying their own saved permission, and a Reset action beside it returns those tools to the default. Session-only grants are excluded from the count — they appear per-tool and reset on their own.
Make permanent: a tool granted only for the current chat shows a "Make permanent" action to save that choice for the team.
Section ordering by context: flow-building tools lead in the immersive editor; platform tools lead in the app.
Role inheritance, fail-closed: read-only team members cannot enable or trigger write/delete tools and see why; the agent also fails closed server-side.
Version gating: each tool carries a min/max nr-assistant version. Tools render as available, "update required", or deprecated against the instance's nr-assistant version. Versioned variants (e.g. Manage Groups v1/v2) collapse into one row and resolve to the in-range variant, nudging an update to the newest variant's min version when behind.
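
The version gating in the last item can be sketched roughly as below. This is a minimal illustration, not the PR's code: `compareVersions`, `toolAvailability`, `resolveVariant` and the `minVersion` / `maxVersion` field names are hypothetical, and it handles only plain `major.minor.patch` versions.

```javascript
// Hypothetical sketch of version gating; all names and fields are assumptions.

// Compare two plain "major.minor.patch" strings; returns -1, 0 or 1.
function compareVersions (a, b) {
    const pa = a.split('.').map(Number)
    const pb = b.split('.').map(Number)
    for (let i = 0; i < 3; i++) {
        const diff = (pa[i] || 0) - (pb[i] || 0)
        if (diff !== 0) return diff < 0 ? -1 : 1
    }
    return 0
}

// Classify one tool against the instance's nr-assistant version.
function toolAvailability (tool, installed) {
    if (tool.minVersion && compareVersions(installed, tool.minVersion) < 0) return 'update-required'
    if (tool.maxVersion && compareVersions(installed, tool.maxVersion) > 0) return 'deprecated'
    return 'available'
}

// Collapse versioned variants into one row: use the in-range variant and,
// when the instance is behind, point at the newest variant's min version.
// Assumes every variant carries a minVersion.
function resolveVariant (variants, installed) {
    const active = variants.find(v => toolAvailability(v, installed) === 'available') || null
    const newest = variants.reduce((a, b) => compareVersions(a.minVersion, b.minVersion) >= 0 ? a : b)
    const behind = compareVersions(installed, newest.minVersion) < 0
    return { active, updateTo: behind ? newest.minVersion : null }
}
```

With two variants of one tool, an instance inside the older variant's window would resolve to that variant and still be nudged towards the newer variant's minimum version.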

Architecture

The agent decides policy at the toolsNode seam (sibling of the plan-mode gate): role check first, then per-tool policy. allow runs, deny feeds the denial back to the model so it adapts and explains, ask publishes expert:tool-approval and awaits the browser's decision.
The flow-building tool catalog is served over the agent's GET /mcp/flow-tools endpoint (friendly name + scope + version window only). Forge exposes GET /api/v1/expert/mcp/tools, which proxies that endpoint and returns the merged catalog: FlowFuse platform tools are curated into the same array (tagged as a platform group) — wired and commented out until the platform-tool work is merged, at which point it's a one-line switch. Every chat response carries a hash of the flow-building catalog; the browser refetches only when the hash diverges, so it stays correct across rolling deploys where instances can be on different versions.
Saved per-team choices and per-chat session grants live in the existing product-assistant / product-expert Pinia stores.
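
As a rough illustration of the gate order described in this section (role check first, then per-tool policy), the decision could look like the sketch below. The function names, the shape of `prefs` and `session`, and the precedence of a session grant over a saved per-tool override over the type default are assumptions for the example, not the agent's actual code.

```javascript
// Hypothetical sketch of the gate: role check first, then per-tool policy.
// Assumed precedence: session grant > saved per-tool override > type default.
function resolvePolicy (tool, prefs, session) {
    if (session[tool.id]) return session[tool.id]
    if (prefs.overrides[tool.id]) return prefs.overrides[tool.id]
    return prefs.defaults[tool.group]?.[tool.scope]
}

function gateToolCall (tool, user, prefs, session) {
    // Fail closed: read-only members never run write/delete tools.
    if (user.readOnly && tool.scope !== 'read') {
        return { action: 'deny', reason: 'role' }
    }
    const policy = resolvePolicy(tool, prefs, session)
    if (policy === 'allow') return { action: 'run' }
    if (policy === 'ask') return { action: 'ask' }
    // An explicit deny, or any unknown or missing policy, is treated as deny.
    return { action: 'deny', reason: 'policy' }
}
```

Treating a missing policy as a denial keeps the sketch fail-closed, in line with the role-inheritance behaviour described above.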

UI

The settings panel follows existing FlowFuse patterns: FormHeading for section titles, ff-data-table for the defaults and tool lists, ff-accordion to collapse the per-tool detail, and the shared three-button toggle for each policy control, so each tool name lines up with its own control across the row border.

Out of scope (follow-ups)

Admin-configurable team-wide default policy (DB migration + admin UI + server enforcement).
Enabling platform (non-flow-building) tools in the catalog, once the platform-tool work is merged into the agent. The forge endpoint has the curation wired and commented, ready to switch on.

Testing

Build + color/eslint lint green.
Automated unit tests:
- forge/ee/routes/expert/index_spec.js (new MCP tools Endpoint block): GET /mcp/tools auth (401 for instance/device tokens), team-access (404 for non-members), missing teamId (400 from the querystring schema), the flow-tools catalog + hash proxy (asserting the upstream /mcp/flow-tools URL and service token), the empty-response defaults (catalog: [], hash: null), and upstream error-status propagation.
- frontend/src/stores/product-assistant.spec.js (new tool-permissions block): the permission-resolution engine, i.e. the classOf/groupOf helpers, per-team class defaults, saved vs. session policy resolution, resolvedToolPermissions, version gating (toolAvailabilityFor), catalog/preference/override mutations, resetGroupClassPreferences, promoteSessionOverride, and the pending-approval registry.
- frontend/src/stores/product-expert-tool-permissions.spec.js (new): catalog fetch (success / no-team / error) and the approval round-trip (session short-circuit, resolve, always-allow, always-deny, cancel).
Manual: catalog populates on opening the immersive editor; Ask shows the card and pauses with no timeout (Allow applies to canvas, Deny explains gracefully); "Always allow"/"Always deny" apply for the chat and reset on Start Over / refresh; "Make permanent" saves the choice per team; a per-tool override holds against changing the type default, the "N set individually" count reflects it, and Reset returns those tools to the default; per-team settings stay separate across team switches and survive reload; read-only role sees write/delete disabled with a reason and cannot trigger them; leaving immersive hides the panel and stops sending permissions; chat stop while a card is open recovers cleanly.

Requires matching agent-side changes.

Refs FlowFuse/product#421

Screenshots

Expert Settings with Tools permissions

Counter of how many tools have different permissions than their scope's permission

Permissions reset and change behaviour per scope

Screen.Recording.2026-07-01.at.17.24.46.mov

Option to save a permission set for the current session

Screen.Recording.2026-07-01.at.17.32.09.mov

Approval Cards

Screen.Recording.2026-07-01.at.17.30.45.mov

Add per-tool approval for the Expert's flow-building tools in the immersive editor. The agent gates each tool call at the toolsNode seam by class (read/write/delete) and per-tool preference; write/delete default to Ask and surface an inline approval card (Allow / Always allow / Never) that holds the call open with no session timeout, while read defaults to allow.

- Catalog delivered over HTTP (GET /api/v1/expert/mcp/tools), curated to friendly names so raw tool identifiers never reach the browser; a per-response hash triggers a background refetch when the catalog drifts.
- HITL state consolidated into the product-assistant store (defaults, per-tool preferences, pending-approval map) with SemVer version gating.
- Settings panel groups versioned tool variants into one family and points update hints at the newest variant's required version.
- Role inheritance is fail-closed: read-only members cannot enable or trigger write/delete tools and are shown why.
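
The hash-driven refetch mentioned above can be pictured with a small cache like the following. It is a sketch under assumptions: `createCatalogCache` and its methods are hypothetical names, and the response shape (`catalog`, `hash`) is taken from the empty-response defaults listed in the test notes.

```javascript
// Hypothetical sketch: refetch the catalog only when the hash carried on a
// chat response differs from the one the browser last stored.
function createCatalogCache () {
    let hash = null
    let catalog = []
    return {
        getCatalog: () => catalog,
        // True when a hash is present and differs from the stored one.
        needsRefetch: (incoming) => Boolean(incoming) && incoming !== hash,
        // Store the result of a catalog fetch, applying empty defaults.
        store (result) {
            catalog = result.catalog || []
            hash = result.hash || null
        }
    }
}
```

A caller would check `needsRefetch` on each chat response and, when it returns true, fetch GET /api/v1/expert/mcp/tools in the background and pass the result to `store`.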

Use FormHeading for the section titles and ff-data-table for both the action-type defaults and the flow-building tool list, replacing the bespoke section/group styling and the non-standard uppercase scope headers. Bordered table rows pair each tool with its permission control across the row rather than leaving them to float across whitespace; tool scope moves into a Type column. The approval card no longer sends or renders a tool summary; the tool name, scope and call parameters describe the action.

codecov · 2026-06-30T14:04:29Z

Codecov Report

❌ Patch coverage is 91.66667% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 76.59%. Comparing base (e857547) to head (50de406).

Files with missing lines	Patch %	Lines
forge/ee/routes/expert/index.js	91.66%	1 Missing ⚠️

Additional details and impacted files

@@                    Coverage Diff                     @@
##           feat/408-expert-plan-mode    #7639   +/-   ##
==========================================================
  Coverage                      76.58%   76.59%           
==========================================================
  Files                            413      413           
  Lines                          21849    21856    +7     
  Branches                        5760     5763    +3     
==========================================================
+ Hits                           16733    16740    +7     
  Misses                          5116     5116

Flag	Coverage Δ
backend	`76.59% <91.66%> (+<0.01%)`	⬆️


…nd platform tools Fetch the tool catalog when the Expert panel mounts (not only in the editor) so the permissions settings render wherever the Expert is. Split the settings into a Flow Building Tools section, with its own per-action-type default permissions, and a separate FlowFuse Platform Tools section (a placeholder until those tools ship, with TODOs marking where they get mapped in). Flow-building tools are listed everywhere but noted as usable only from an instance editor.

- Show plain Read / Write / Delete scope instead of phrases like Read only
- Stop the Setup Guide badge rendering above the approval card
- Disable the action buttons as soon as a choice is made

Raise the conversation-history expiry from 28 to 30 minutes (warning at 27), so the human-in-the-loop tool-approval wait, which is bounded by the session lifetime, has the full 30-minute window the agent now allows.

Revert the 30-minute expiry back to 28 (warning at 25). The agent clears old transactions/context at 30 minutes, so the chat must expire a moment earlier to avoid referencing backend history that has already been purged. The tool-approval wait is bounded by this 28-minute session lifetime.

Replace the flat key/value list on the approval card with a prettified JSON view of the call payload. Adds a small single-value JsonViewer that reuses the prettify + word-wrap + horizontal-scroll presentation of the snapshot comparison diff panel, without its two-sided diff machinery. The payload is prettified by default; an ff-button Wrap toggle appears for long lines and reflects its on/off state rather than changing its label.

Harden the JSON payload viewer against malformed input and collapse the payload once a decision is made.

- JsonViewer stringify can no longer throw: circular refs, BigInt and any other non-serialisable value fall back to a circular-safe pass, then to a plain coercion, so a bad payload never breaks the approval card.
- Add a live collapse toggle to JsonViewer (collapsible + defaultCollapsed). The header caret expands/collapses at any time; the parent can seed the initial state.
- ToolApprovalCard collapses the payload once the call is allowed, always allowed or denied (local decision or round-tripped status), while leaving the toggle live so the user can re-expand it.

Drop the circular-safe/BigInt fallback machinery from the payload viewer. Tool-call params are plain JSON; if they somehow can't be serialised, show a simple 'Could not display the payload.' message rather than placeholder markers.
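
A minimal sketch of the simplified behaviour described here, assuming a hypothetical helper named `prettifyPayload` (the actual JsonViewer code is not shown on this page):

```javascript
// Hypothetical sketch: prettify a tool-call payload, falling back to a
// fixed message when it cannot be serialised.
const FALLBACK = 'Could not display the payload.'

function prettifyPayload (payload) {
    try {
        const text = JSON.stringify(payload, null, 2)
        // JSON.stringify returns undefined for values such as undefined or functions.
        return typeof text === 'string' ? text : FALLBACK
    } catch (err) {
        // Circular references and BigInt values make JSON.stringify throw.
        return FALLBACK
    }
}
```

Plain JSON params take the first branch; anything else ends up as the single fallback string rather than placeholder markers.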

- Replace the unicode caret on the JSON payload collapse toggle with the standard rotating ChevronRightIcon (matches ToolCallItem section headers).
- Add a 'bare' prop to MessageBubble that strips the bubble background and padding, and use it for tool-approval answers so the approval card renders as a standalone card instead of a card nested inside an AI bubble.

…ttings redesign (#421)

- Scope saved tool permissions per team (defaults and per-tool overrides), replacing the single global store.
- Make approval-card 'Always allow' / 'Always deny' apply to the current chat only: they reset on Start Over and refresh, and the card shows exactly what was chosen. Add an 'Always deny' button. A session grant can be made permanent from the settings dialog.
- Redesign the settings panel: always-visible read/write/delete defaults, the per-tool list collapsed in an accordion, and a three-button toggle in place of the dropdown ('Always allow' / 'Ask' / 'Always deny'). Order flow-building first in the editor and platform first in the app. Surface any session grant with a 'Make permanent' action.
- Add a size prop to ToggleButtonGroup so dense contexts set button size instead of reaching into it with :deep.
- Rename the agent flow-tools endpoint to /mcp/flow-tools; the forge route returns the merged catalog and is prepared to fold in platform tools (wired and disabled).
- Match the approval card width to the tool-calls summary strip.

…e sizing

- Split read/write/delete defaults per tool group (flow-building and platform)
- Honour a session Always allow/deny without re-prompting later in the same loop
- Show the pressed decision on the approval card immediately
- Fix policy toggle label overflow and widen the settings dialog

…ly-set tools Add a Reset action beside each read/write/delete class default that clears the saved per-tool preferences for that scope, so tools detached from the default can be returned to it. A "N set individually" counter shows how many tools in the scope carry their own saved permission (session-only grants are excluded — they already appear per-tool and reset on their own). The intro now documents that a per-tool setting overrides its type default until reset.
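
The counter and Reset action described above could be sketched as follows. The function names and the shapes of `catalog` (tools with `id`, `group`, `scope`) and `overrides` (tool id to saved policy) are assumptions for illustration; session-only grants are assumed to live in a separate structure and so are not counted.

```javascript
// Hypothetical sketch of the "N set individually" count and the Reset action.
// `overrides` maps tool id -> saved policy; session-only grants live elsewhere.
function countSetIndividually (catalog, overrides, group, scope) {
    return catalog.filter(t => t.group === group && t.scope === scope && t.id in overrides).length
}

// Return a copy of the overrides with every tool in the scope cleared,
// so those tools fall back to the type default.
function resetScope (catalog, overrides, group, scope) {
    const next = { ...overrides }
    for (const t of catalog) {
        if (t.group === group && t.scope === scope) delete next[t.id]
    }
    return next
}
```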

n-lark · 2026-07-01T18:14:35Z

+            for (const m of this._agentStore.messages) {
+                if (!Array.isArray(m.answer)) continue
+                for (const a of m.answer) {
+                    if (a.kind === 'tool-approval' && a.status === 'pending') a.status = 'denied'


Couldn't test this without the agent side, but reading the code: on Stop, cancelPendingToolApprovals sets status='denied' on the store answer, but the card renders a shallow copy of it (useStreamingList({ shallow: true })), so its status prop never updates. Worth confirming, but it looks like Stop won't resolve an open approval card.

Good catch, confirmed. The card renders a detached streaming copy of the answer (AiMessage uses useStreamingList with shallow: true), so writing the status onto the store message never reached it. On Stop the buttons stayed live.

Fixed by recording the outcome in a reactive per-id map (toolApprovalStatuses) on the product-assistant store. AnswerWrapper now feeds the card its status from that map, so an external resolution (Stop / Start Over) updates a card the user never pressed. localStatus stays for instant feedback on the user's own press. Added store and product-expert tests covering the denied-on-cancel path.

n-lark · 2026-07-01T18:16:32Z


+            // A new chat drops the per-session tool grants ("Always allow/deny for this chat").
+            useProductAssistantStore().clearSessionToolOverrides()
+


Do we want a cancelPendingToolApprovals() here too?

Yes. startOver now calls cancelPendingToolApprovals() first, so any approval still awaiting a decision resolves (as denied) and the agent's paused tool call unblocks instead of hanging on a message we are about to drop. It also clears toolApprovalStatuses alongside the session overrides.

n-lark

Hey, so I cannot test the approve/deny part of this in the chat because the staging env does not have posthog synced up. The permissions page under the settings UI looks fine to me, but I don't feel comfortable approving this since I cannot test it and am unfamiliar with this feature. I'd recommend @cstns or @Steve-Mcl take a look.

…421)

Add automated coverage for the human-in-the-loop tool-permission work:

- forge GET /mcp/tools: auth (401 instance/device), team-access (404), missing teamId (400), catalog+hash proxy, empty-response defaults and upstream error propagation.
- product-assistant store: permission-resolution engine (class/group helpers, per-team defaults, per-tool and session overrides, resolved permissions, version gating, and the pending-approval registry).
- product-expert store: catalog fetch and the approval round-trip (session short-circuit, resolve/always-allow/always-deny, cancel).

The approval card renders a detached streaming copy of its answer, so writing a resolved status onto the store message never reached it. On chat stop the card stayed on its Allow/Deny buttons even though the pending call had been denied. Record approval outcomes in a reactive per-id map on the product-assistant store and have AnswerWrapper feed the card its status from that map, so an external resolution (chat stop / Start Over) updates a card the user never pressed. Start Over now also cancels open approvals and clears the map.

andypalmi · 2026-07-01T18:57:54Z

Thanks for the review. Pushed a fix for the Stop issue you spotted.

Root cause: the approval card renders a detached streaming copy of its answer (AiMessage uses useStreamingList with shallow: true), so a status written onto the store message never reached the card. Clicking Allow/Deny worked only because the card tracks its own localStatus; the external Stop path had no way in, so the buttons stayed live.

Fix: approval outcomes are now recorded in a reactive per-id map (toolApprovalStatuses) on the product-assistant store, and AnswerWrapper feeds the card its status from that map. That covers external resolutions, Stop and Start Over, on a card the user never pressed. Start Over also cancels open approvals first so the paused tool call unblocks. This is in-memory session state only, not persisted, same lifecycle as the session overrides. Added store and product-expert tests for the denied-on-cancel path.
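
To make the fix easier to picture, here is a minimal sketch of a per-id status map. It is an illustration only: `createApprovalStatuses` and its methods are hypothetical, and a plain object stands in for what would be reactive store state in the real code.

```javascript
// Hypothetical sketch of a per-id approval status map. In the store this
// would be reactive state; a plain object stands in for it here.
function createApprovalStatuses () {
    const statuses = {}
    return {
        statuses,
        open (id) { statuses[id] = 'pending' },
        // Record the user's decision; already-resolved cards are left alone.
        resolve (id, status) {
            if (statuses[id] === 'pending') statuses[id] = status
        },
        // Stop / Start Over: every card still pending becomes denied.
        cancelPending () {
            for (const id of Object.keys(statuses)) {
                if (statuses[id] === 'pending') statuses[id] = 'denied'
            }
        },
        // Start Over also drops the recorded outcomes.
        clear () {
            for (const id of Object.keys(statuses)) delete statuses[id]
        }
    }
}
```

Because each card reads its status by id from this map rather than from its own copy of the answer, an external cancellation reaches cards the user never pressed.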

On testing: understood you cannot exercise approve/deny on staging without the agent side synced. @cstns or @Steve-Mcl, a second look would be welcome given the reactivity change.

#7598) Co-authored-by: Steve-Mcl <sdmclaughlin@gmail.com> Co-authored-by: Stephen McLaughlin <44235289+Steve-Mcl@users.noreply.github.com> Co-authored-by: Andrea Palmieri <76187074+andypalmi@users.noreply.github.com> Co-authored-by: andypalmi <andrea@flowfuse.com>

…421) Curate the FlowFuse platform automation tools from the handler singleton (app.comms.platformAutomation) into the /mcp/tools catalog alongside the flow-building tools, tagged group:'platform' so the UI routes them to their own section with their own read/write/delete defaults. Read/write/delete class is derived from each tool's MCP annotations; platform tools carry no nr-assistant version window.
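
One way the class derivation could work is sketched below. `readOnlyHint` and `destructiveHint` are annotation fields defined by the MCP specification, but the mapping itself, the function names and the `friendlyName` field are assumptions for this example and not taken from the PR.

```javascript
// Hypothetical mapping from MCP tool annotations to an action class.
// Anything not explicitly marked read-only or destructive is treated as write.
function classFromAnnotations (annotations = {}) {
    if (annotations.readOnlyHint === true) return 'read'
    if (annotations.destructiveHint === true) return 'delete'
    return 'write'
}

// Curate platform tools into catalog entries tagged with the platform group.
// `friendlyName` is an assumed field; no version window is attached.
function curatePlatformTools (tools) {
    return tools.map(t => ({
        name: t.friendlyName,
        group: 'platform',
        scope: classFromAnnotations(t.annotations)
    }))
}
```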

Replace the mid-turn approval round-trip with a stateless defer/resume flow so the agent never stays resident waiting on a human. When a turn needs approval it returns the approval card(s) and ends; the browser collects the decisions and sends them back in one resume message that continues the turn.

- product-expert: track the open approval batch, resume once every card is answered, transport-agnostic (MQTT push or awaited HTTP reply).
- product-assistant: drop the promise-based pending-approval registry; the store now only records per-card outcome statuses and session grants.
- tests updated for the batch model.
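
The batch model can be sketched on the browser side roughly as follows. `createApprovalBatch`, `decide` and the shape of the resume payload are hypothetical; `sendResume` stands in for whichever transport is in use.

```javascript
// Hypothetical sketch of the defer/resume batch on the browser side.
// `ids` are the approval cards returned by the deferred turn.
function createApprovalBatch (ids, sendResume) {
    const decisions = new Map()
    let sent = false
    return {
        // Record one card's decision; returns true once the resume was sent.
        decide (id, decision) {
            if (sent || !ids.includes(id)) return false
            decisions.set(id, decision)
            if (decisions.size === ids.length) {
                sent = true
                // One resume message carries every decision for the turn.
                sendResume(ids.map(i => ({ id: i, decision: decisions.get(i) })))
            }
            return sent
        }
    }
}
```

Nothing is sent until the last open card is answered, so the agent only sees a single message that continues the turn.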

cstns and others added 6 commits June 22, 2026 19:28

Implement MCP server routes, feature flags, and related tests

b7023ce

fix failing tests by replacing null with empty string for PAT creation in MCP server tests

c413c16

fix failing unit test

6b598c8

Merge remote-tracking branch 'origin/main' into 7426_mcp-scaffolding

1ec0bb6

andypalmi force-pushed the feat/421-expert-tool-permissions branch from 8a97b8e to bdb36fb Compare June 30, 2026 13:46

andypalmi temporarily deployed to staging June 30, 2026 13:57 — with GitHub Actions Inactive

andypalmi temporarily deployed to staging June 30, 2026 14:11 — with GitHub Actions Inactive

fix(expert): tidy tool approval card

ff6ee80

andypalmi temporarily deployed to staging June 30, 2026 14:20 — with GitHub Actions Inactive

andypalmi added 2 commits June 30, 2026 16:35

Merge branch 'feat/408-expert-plan-mode' into feat/421-expert-tool-permissions

3368791

Merge branch 'feat/408-expert-plan-mode' into feat/421-expert-tool-permissions

438f60c

# Conflicts:
#   frontend/src/components/expert/components/ExpertChatInput.vue

andypalmi temporarily deployed to staging June 30, 2026 15:25 — with GitHub Actions Inactive

fix(expert): extend chat session lifetime to 30 minutes

dd6366c

andypalmi temporarily deployed to staging June 30, 2026 15:38 — with GitHub Actions Inactive

andypalmi temporarily deployed to staging June 30, 2026 22:07 — with GitHub Actions Inactive

andypalmi temporarily deployed to staging June 30, 2026 22:28 — with GitHub Actions Inactive

andypalmi added 3 commits July 1, 2026 00:31

andypalmi temporarily deployed to staging June 30, 2026 22:40 — with GitHub Actions Inactive

cstns and others added 2 commits July 1, 2026 13:31

Merge branch 'main' into 7426_mcp-scaffolding

3afa8c2

Fix RBAC permission check for inflight messages (#7641)

503263b

andypalmi mentioned this pull request Jul 1, 2026

feat(expert): plan mode — propose a plan before acting (#408, #409) #7635

Open

docs: Fix team type names on the Static asset service page (#7643)

66f87ec

Co-authored-by: ppawlowski <piotr@flowfuse.com>

Merge branch 'feat/408-expert-plan-mode' into feat/421-expert-tool-permissions

6ffa250

# Conflicts:
#   frontend/src/components/expert/components/ExpertChatInput.vue
#   frontend/src/components/expert/components/messages/components/AnswerWrapper.vue

andypalmi requested a review from n-lark July 1, 2026 11:37

andypalmi temporarily deployed to staging July 1, 2026 11:40 — with GitHub Actions Inactive

docs(expert): trim redundant tool-permission comments

1b7ed4d

andypalmi temporarily deployed to staging July 1, 2026 11:48 — with GitHub Actions Inactive

andypalmi temporarily deployed to staging July 1, 2026 13:56 — with GitHub Actions Inactive

andypalmi temporarily deployed to staging July 1, 2026 14:59 — with GitHub Actions Inactive

andypalmi temporarily deployed to staging July 1, 2026 15:22 — with GitHub Actions Inactive

andypalmi self-assigned this Jul 1, 2026

n-lark and others added 4 commits July 1, 2026 08:37

[7644] Dropdowns that allow manual input will not accept inputs that are prefix of offered options (#7646)

c70a5f5

[7441] Consolidate broker credential generation (#7642)

88f7582

Co-authored-by: Costin Serban <cstn.serban@gmail.com>

Insights for devices and self hosted platforms (#7604)

0c9b327

Merge branch 'main' into 7426_mcp-scaffolding

484652e

n-lark reviewed Jul 1, 2026

andypalmi temporarily deployed to staging July 1, 2026 18:50 — with GitHub Actions Inactive

andypalmi temporarily deployed to staging July 1, 2026 19:00 — with GitHub Actions Inactive

cstns and others added 4 commits July 1, 2026 22:53

merge: bring MCP scaffolding (#7596) into HITL branch for PlatformAutomationsHandler integration

97cfb22

# Conflicts:
#   frontend/src/stores/context.js

andypalmi deployed to staging July 1, 2026 22:08 — with GitHub Actions Active

andypalmi wants to merge 34 commits into feat/408-expert-plan-mode from feat/421-expert-tool-permissions

5 participants

