Skip to content

@W-23049417: Add stale-content-cleanup-apply prompt and HITL confirm primitive#405

Open
Alon-ST-DATA wants to merge 5 commits into
mainfrom
asimantov/w-23049417-add-stale-content-cleanup-apply-prompt
Open

@W-23049417: Add stale-content-cleanup-apply prompt and HITL confirm primitive#405
Alon-ST-DATA wants to merge 5 commits into
mainfrom
asimantov/w-23049417-add-stale-content-cleanup-apply-prompt

Conversation

@Alon-ST-DATA

@Alon-ST-DATA Alon-ST-DATA commented Jun 18, 2026

Copy link
Copy Markdown
Contributor

W-23049417 (2B-5): adds the destructive Apply phase of Stale Content Cleanup as an admin-gated prompt that orchestrates report -> resolve LUIDs -> tag -> notify -> human-in-the-loop gate -> grace -> confirmed delete, with dryRun defaulting to true.

W-23050036 (2B-6): adds src/prompts/_lib/confirm.ts, the first shared _lib helper. Content-agnostic text builders renderHitlGate() and renderConfirmInstructions(). MCP SDK v1.x has no runtime elicitation, so HITL is enforced as a prompt-text contract.

Also extracts the duplicated owner-email lookup into src/tools/web/users/resolveOwnerEmail.ts, now shared by deleteWorkbook.ts and deleteDatasource.ts, and adds prompt docs pages.

IMPORTANT: Please do not create a Pull Request without creating an issue first.

Any change needs to be discussed before proceeding. Failure to do so may result in the rejection of
the pull request.

Pull Request Template

Description

Motivation and Context

Type of Change

  • Bug fix
  • New feature
  • Breaking change
  • Documentation update
  • Other (please describe):

How Has This Been Tested?

Related Issues

Checklist

  • I have updated the version in the package.json file by using npm run version. For example,
    use npm run version:patch for a patch version bump.
  • I have made any necessary changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • I have documented any breaking changes in the PR description. For example, renaming a config
    environment variable or changing its default value.

E2E Test Report — stale-content-cleanup-apply MCP Prompt

Date: 2026-06-17 (F1 fix verified 2026-06-18)
Tester: manual e2e (Claude Code)
Work item: W-23049417
Verdict: ✅ PASS (5/5 dry-run steps) — F1 footgun FIXED & re-verified; 2 minor/carried-over findings remain


1. Scope & Subject

Field Value
Under test MCP prompt stale-content-cleanup-apply (not a tool — a server-registered workflow)
Orchestrates get-stale-content-reportlist-* (LUID resolve) → two-phase delete-*
Server build ~/dev/git_repos/main/tableau-mcp, branch asimantov/w-23049417-add-stale-content-cleanup-apply-prompt, HEAD d85c0ad, v2.16.0
Prompt source src/prompts/staleContent/apply.ts (193 lines) — verified verbatim == client-received text
Site / env admin-insights-tmcp-site @ 10ax; REST creds 10axtestsite.env.list (contentUrl=SITE_NAME)
dryRun default true (args.dryRun !== 'false')

2. Test Approach

Live get-stale-content-report returned 3376 real stale items (~330 GB) owned by many real users. Running the workflow verbatim would Step-3 tag all 3376 real items pending-deletion (a real write) before the Step-5 human gate.

Decision: declined verbatim run; tested against 2 self-owned disposable fixtures (1 workbook + 1 datasource → exercises both routing branches). Fixtures published via REST multipart/mixed into throwaway project zz_stale_e2e_disposable, torn down after.

All assertions verified against independent REST ground truth, never the tool's own claims.


3. Step-by-Step Results

Step Action Result
1 — Report (read-only) get-stale-content-report called once, minAgeDays 90, itemTypes [Workbook, Datasource] ✅ PASS — 3376 rows; schema matches doc (itemId/itemType/itemName/project/ownerEmail/daysSinceLastUse/size/neverAccessed). No mutation.
2 — Resolve LUIDs routing listTool + filter name:eq:<n>,projectName:eq:<proj> ✅ PASS — both fixtures → correct LUID; owner.id present.
3 — Tag (preview) each deleteTool, confirm omitted, tag:"pending-deletion" ✅ PASS — per-item tokens returned (WB 2daa4a632b59, DS 0efb7f164243). REST GT confirms both tagged AND still exist (200). DS preview ran downstream-dep check. Owner resolved to email in message.
4 — Notify (report-only) table grouped by owner ✅ PASS — owner emails present from report/preview; no email sent.
5 — STOP (dry run) present items + notify table, no confirm:true ✅ PASS — stopped; zero delete tool called with confirm:true.

Fixtures used:

Type Name LUID Owner
Workbook stale_e2e_wb_happy 15d8b14a-4f9f-45a9-831a-8edf10582bf0 asimantov@tableau.com
Datasource stale_e2e_ds_happy ed8d979c-433a-40c9-8e86-a41b59bb8209 asimantov@tableau.com

REST tag verification: workbook (200, ['pending-deletion']) · datasource (200, ['pending-deletion'])


4. Findings

✅ F1 — Unguarded mass-tag before human gate (design footgun) — FIXED

Original (HEAD d85c0ad): Step 3 tagged every resolved report row pending-deletion (a real write) before the Step-5 human gate, with no row cap.

  • HITL gate (renderHitlGate) guarded the DELETE only, not the TAG.
  • dryRun blocked the confirmed-delete only — it did NOT prevent tagging.
  • On the live report this = 3376 unapproved tag-writes on other users' production content; the "safe rehearsal" dry run still wrote.

Fix (commit 30ffa13, branch asimantov/w-23049417-add-stale-content-cleanup-apply-prompt): moved the human gate ahead of any write.

  • Steps 1–3 (report → resolve LUIDs → notify) are now strictly read-only.
  • The confirmation break (now Step 4) precedes tagging; tagging (Step 5) runs only in the dryRun:false branch and only for items the user explicitly approved.
  • dryRun (default true) now writes nothing at all — no tag, no delete. The rehearsal is genuinely safe.
  • Added a 100-row guard (LARGE_REPORT_THRESHOLD): oversized reports stop and ask the user to narrow scope rather than tag/delete an unreviewed batch.
  • HITL gate verb widened deletetag or delete.

Re-verification (2026-06-18): prompt unit 16/16 pass (+4 F1 tests); stale-content e2e 9/9 pass live against admin-insights-tmcp-site @ 10ax (+2 F1 tests asserting gate-precedes-tag ordering and the dry-run no-write contract). Docs + prompt description/title updated to the new ordering.

🟡 F2 — Error-polish gap (carried over)

Underlying delete-* tools still surface raw Request failed with status code 404 for malformed/not-found LUIDs (no up-front format validator). Not re-exercised here (fixtures valid). See tmcp-delete-tools-findings.

🟡 F3 — Server checkout moved (doc/config)

Running local server now serves from main/ @ v2.16.0; the apply prompt exists only there (absent from the v2.14.0 delete-datasource checkout). Prior "main/ is stale" note was outdated. Confirm canonical checkout for the local server.


5. Not Exercised (fixtures too clean)

  • Step 2 ambiguous-name "unresolved (skipped)" path (needs 2 same-name items / same project).
  • Step 4 list-users email-fallback (needs a row lacking ownerEmail).
  • dryRun:false → Step 6 (grace check) + Step 7 (confirmed delete).
  • Empty-report early-exit ("No stale items found above the threshold.").

--- test re-run ----

stale-content-cleanup-apply (PR #405, W-23049417) re-run @ c8eb600 — PASS.

Verified the review fixes: prior top finding (mass-tag-before-gate footgun) is FIXED — tagging moved to post-approval Step 5, Steps 1-3 now read-only, plus a 100-row soft cap. New tag prompt-injection regex and narrow-never-widen both
verified via rendered prompt + schema parse. Unit tests 27/27 (20 apply + 7 HITL).

Ran the full destructive leg live on disposable fixtures (1 datasource + 1 workbook in a throwaway project): preview→tag→token-guard→confirmed-delete, with independent REST ground-truth at each step — tags really applied,
no-token/forged/cross-content tokens all rejected, deletes really 404'd, project torn down clean. Live Step-1 report = 1540 stale items/130GB, so the 100-row guard is relevant.


6. Teardown

All fixtures + throwaway project deleted via REST (204), re-verified gone (404 / project count 0). Scratch dirs /tmp/stale_e2e + /tmp/tabenv wiped.


7. Artifacts

  • Detailed run log: stale-content-cleanup-apply-e2e-results-2026-06-17.md
  • Memory: tmcp-stale-content-apply-prompt.md

Contributor Agreement

By submitting this pull request, I confirm that:

W-23049417 (2B-5): adds the destructive Apply phase of Stale Content
Cleanup as an admin-gated prompt that orchestrates report -> resolve
LUIDs -> tag -> notify -> human-in-the-loop gate -> grace -> confirmed
delete, with dryRun defaulting to true.

W-23050036 (2B-6): adds src/prompts/_lib/confirm.ts, the first shared
_lib helper. Content-agnostic text builders renderHitlGate() and
renderConfirmInstructions(). MCP SDK v1.x has no runtime elicitation,
so HITL is enforced as a prompt-text contract.

Also extracts the duplicated owner-email lookup into
src/tools/web/users/resolveOwnerEmail.ts, now shared by
deleteWorkbook.ts and deleteDatasource.ts, and adds prompt docs pages.

Adds e2e coverage (tests/e2e/prompts/staleContent.test.ts) verified
against a live Tableau Cloud site: prompt registers with
ADMIN_TOOLS_ENABLED=true, is absent when off, and prompts/get returns
the dry-run default, content-type routing, arg pass-through, and the
confirmed-delete phase when dryRun=false. Bumps version 2.14.0 -> 2.15.0.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
@Alon-ST-DATA Alon-ST-DATA force-pushed the asimantov/w-23049417-add-stale-content-cleanup-apply-prompt branch from d0b93fe to ab749a1 Compare June 18, 2026 01:07
@Alon-ST-DATA Alon-ST-DATA changed the title Add stale-content-cleanup-apply prompt and HITL confirm primitive @W-23049417: Add stale-content-cleanup-apply prompt and HITL confirm primitive Jun 18, 2026
Alon-ST-DATA and others added 2 commits June 17, 2026 18:10
…add-stale-content-cleanup-apply-prompt

# Conflicts:
#	package-lock.json
#	package.json
The Apply workflow tagged every resolved report row pending-deletion — a
real write to content — before the human-confirmation break, and dryRun
only blocked the final delete, not the tag. On a live site that meant
thousands of unapproved tag-writes on other owners' content, and the
"safe rehearsal" dry run still wrote.

Move the human gate ahead of any write:
- Steps 1-3 (report, resolve LUIDs, notify) are now strictly read-only.
- The confirmation break comes before tagging; tagging only runs in the
  dryRun:false branch and only for items the user explicitly approved.
- dryRun (default true) now writes nothing at all — no tag, no delete.
- Add a 100-row guard: oversized reports stop and ask the user to narrow
  scope rather than tagging/deleting an unreviewed batch.
- Widen the HITL gate verb from "delete" to "tag or delete".

Update prompt description/title, docs, and unit + e2e tests to match.
Unit 16/16 and e2e 9/9 (live Tableau Cloud) pass.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
* approval before proceeding. Use this AFTER a reversible preview/tag step and BEFORE any confirmed
* destructive call.
*/
export function renderHitlGate({

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MattGPT — automated-assist review on @mattcfilbert's behalf, auth-lead pass]

@Alon-ST-DATA the gating is solid — adminToolsEnabled + assertAdmin + dedicated delete scopes + resourceAccessChecker all hold, dryRun defaults true, and large batches are capped. Two things worth a look:

P2 — say "friction," not "two-step guarantee." This new HITL primitive is honest in its own docstring (L4: "no runtime elicitation/sampling primitive (v1.x)"), but the delete tools it orchestrates compute confirmationToken = sha256(siteId:datasourceId).slice(0,12) — both inputs caller-derivable, so an in-scope admin (or a misaligned model) can confirm:true in one shot with no preview. The existing delete-datasource test already encodes that path.

  • Not an authz hole — the real gates above still apply. It's two-step integrity that's advisory.
  • The gap is wording: the delete-tool description claims the token "forces a deliberate two-step delete." Either enforce it (gate confirm:true on the pending-deletion tag being present — server-written during preview, so the token becomes preview-evidenced not precomputable), or match the words to the code here (friction gate; HITL + grace advisory under SDK v1.x). Datasource deletes are permanent, so I lean enforce. Worth a test pinning whichever you choose.

P3 — apply.ts:107 narrowing can silently widen. requestedTypes.length > 0 ? requestedTypes : ALL_CONTENT_TYPES — passing only unsupported/typo'd/wrong-case types (Flow, Workbooks, workbook) filters to [] → falls back to both types, indistinguishable from "omitted," no warning. Gated by dryRun+HITL so it's minor, but narrow-should-never-widen on a destructive prompt. Suggest: if itemTypes is supplied but resolves empty, stop and list supported types. (Current test only covers a valid Workbook.)

Nice extraction of resolveOwnerEmail into a shared helper, btw.

@Alon-ST-DATA Alon-ST-DATA Jun 19, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks — addressed both.

P2 (two-step wording): went with reword now, enforce as follow-up (datasource permanence noted). The param descriptions and tool docstrings in delete-workbook/delete-datasource (+ both doc pages) no longer claim the token "forces a deliberate two-step delete" — they now say it is a friction gate requiring a deliberate second call, and explicitly note the token is a deterministic hash of caller-known inputs so it does not by itself prove a preview ran. A follow-up to enforce server-side (gating confirm:true on the pending-deletion tag written during preview) will be filed under epic a3QEE000002AWIP2A4 — GUS WI number to follow.

P3 (apply.ts:107 narrowing can widen): fixed — see the thread on that line. Supplied-but-all-unsupported itemTypes now short-circuits with an error listing supported types instead of falling back to ALL.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

W-23093455 - Enforce stale-content delete confirmation server-side via pending-deletion tag gate

@Akash-Rastogi Akash-Rastogi left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall the PR is in good shape — clean factoring of the HITL primitive, correct F1 fix with tests that pin the gate-before-tag ordering, and DRY extraction of resolveOwnerEmail. Local validation passes (npm run build, npx tsc --noEmit, npm run lint, npm test 1797/1797 incl. all new apply/confirm tests); minor version bump (2.15.1 → 2.16.0) is appropriate for a new feature.

The inline comments below are mostly polish + one prompt-injection hardening I'd like to see addressed before merge (the tag arg interpolated raw into the prompt). One broader thought worth a follow-up: the 100-row guard is a prompt-text instruction only, so the safety guarantee is only as strong as the model's compliance — if you want it hard, consider enforcing it server-side inside get-stale-content-report itself in a separate change.

I did not run e2e/oauth locally (need a live Tableau site / Playwright env). The PR description's manual e2e against admin-insights-tmcp-site @ 10ax covers the F1 re-verification well.

* Above this many report rows, the workflow refuses to tag/delete in one pass and asks the user to
* narrow scope first — guards against an unreviewed mass write across other owners' content (F1).
*/
const LARGE_REPORT_THRESHOLD = 100;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit / caveat — soft guard. LARGE_REPORT_THRESHOLD is enforced only via prompt text in Step 1 (line 151). A model that ignores or summarizes that instruction can still resolve all 3376+ rows. Worth either (a) a code comment here noting the soft-guard nature, or (b) a follow-up to enforce server-side inside get-stale-content-report so the safety guarantee doesn't hinge on model compliance.

@Alon-ST-DATA Alon-ST-DATA Jun 19, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a SOFT GUARD note on LARGE_REPORT_THRESHOLD spelling out that enforcement is prompt-text-only and the guarantee is only as strong as model compliance, with a pointer to the server-side follow-up. A follow-up to enforce the cap inside get-stale-content-report will be filed under epic a3QEE000002AWIP2A4 — GUS WI number to follow.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

W-23093467 - Enforce stale-content row cap server-side in get-stale-content-report

.filter(Boolean)
.filter(isContentType)
: [];
const itemTypes: ContentType[] = requestedTypes.length > 0 ? requestedTypes : ALL_CONTENT_TYPES;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — silent fallback footgun. If the user passes only unsupported itemTypes (e.g. itemTypes: 'Flow'), requestedTypes.length === 0 and itemTypes falls through to ALL_CONTENT_TYPES. A user explicitly asking for Flow (perhaps thinking it's supported) inadvertently runs cleanup on Workbook + Datasource — the opposite of what they asked for.

Suggest erroring / short-circuiting when the supplied list is non-empty but contains zero supported types, or surfacing dropped values in the prompt similar to the unresolved-LUID skip pattern.

Also (minor): requestedTypes isn't deduplicated. itemTypes: 'Workbook,Workbook,Datasource' produces a duplicate routing row and duplicate confirmInstructions block. Array.from(new Set(requestedTypes)) would handle it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed both.

  • Silent widen: if itemTypes is supplied but resolves to zero supported types, the prompt now short-circuits and returns an error listing the supported types instead of falling back to ALL_CONTENT_TYPES. Partial-unsupported (e.g. Workbook,Flow) still runs the supported subset but surfaces a NOTE: ignoring unsupported itemTypes: … line, mirroring the unresolved-LUID skip pattern.
  • Dedup: suppliedTypes is now Array.from(new Set(...)), so Workbook,Workbook,Datasource produces a single routing/confirm block.

Added tests for the all-unsupported error, the dropped-types note, and dedup.

Comment thread src/prompts/staleContent/apply.ts Outdated
// One confirm-instruction block per delete tool in scope, so the wording is exact per tool name.
const confirmInstructions = routing
.map(({ deleteTool }) =>
renderConfirmInstructions({ toolName: deleteTool, itemNoun: 'stale item' }),

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit — prompt-real-estate. renderConfirmInstructions is rendered once per deleteTool in scope (today 2 blocks; more as content types grow). The two blocks differ only in the tool-name backticks, and the routing table above already lists the tool names. Could collapse to a single block instructing the model to apply the rule to whichever deleteTool matches the item's itemType from the routing. Not a bug — just keeps the prompt tighter.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — renderConfirmInstructions now takes a toolRef and the prompt renders a single block pointing at the routing table ("the deleteTool the routing table maps the item's itemType to") instead of one block per delete tool. Test pins that only one confirm block is emitted regardless of how many delete tools are in scope.

: [
'**Step 5 — Tag approved items (reversible).** ONLY for the items the user explicitly approved above, ' +
"call each item's `deleteTool` with `confirm` omitted " +
`and \`tag: "${tag}"\` (and the resolved \`idArg\` value). This tags the item '${tag}' (reversible, visible in the ` +

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Medium — prompt-injection vector. tag: "${tag}" interpolates user-controlled text directly into the workflow prompt. The Zod schema (line 63-70) enforces only .max(200) — no character class. A tag like safe", confirm: true, confirmationToken: "... could disrupt the surrounding instructions or attempt to coerce the model into auto-confirming. The HITL gate still defends against actual destruction, but this widens the attack surface unnecessarily.

Fix options:

  1. Tighten the schema, e.g. .regex(/^[A-Za-z0-9 _\-]+$/, 'tag must be alphanumeric / space / underscore / dash').
  2. Escape the value before embedding: ` ${JSON.stringify(tag)} ` — at least closes any quote/backtick injection cleanly.

Option 1 is the simpler, more defensible choice given Tableau tag conventions.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed with option 1. The tag arg schema now enforces /^[A-Za-z0-9 _-]+$/ (letters, numbers, spaces, underscores, dashes), so quote/backtick injection into the prompt text is rejected at validation. Updated the describe text to state the allowed character class.

Comment thread src/prompts/_lib/confirm.ts Outdated
/** Singular noun for the content, e.g. "workbook or data source". */
itemNoun: string;
/** Number of items queued for the action, if known. */
itemCount?: number;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low — unused parameter. itemCount is declared and tested (emits the item count when provided), but the only real caller (apply.ts) can't know the count at prompt-render time — the count materializes only after the model calls get-stale-content-report at runtime. As a result this parameter is reachable from tests but not from production paths.

Either drop it from the API (and the test) or add a // reserved for future use comment so the next reader doesn't assume it's wired through.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dropped itemCount entirely from renderHitlGate (and its test) — you're right it's unreachable at prompt-render time, the count only materializes after the model calls get-stale-content-report at runtime. Removing it also moots the "Present the 1 workbook" wording nit.

Comment thread src/prompts/_lib/confirm.ts Outdated

return [
`🛑 STOP — REQUIRED HUMAN CONFIRMATION before any ${action}.`,
`Present ${countText} ${itemNoun}(s) queued for ${action} to the user and ask them to explicitly ` +

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit — wording when action is multi-word. With action: 'tag or delete' (the apply.ts caller), the rendered text reads:

Present the stale item(s) queued for tag or delete to the user…

"queued for tag or delete" is grammatically awkward — single verbs ('delete') read cleanly but compound verbs don't. Two clean options:

  • Rename actionactionPhrase and let callers pass noun-friendly strings ('tag or delete' is fine in "before any …" but bad in "queued for …"; callers can adjust phrasing).
  • Split into actionVerb ('tag or delete') + actionGerund ('tagging or deletion'); use the gerund in "queued for …" / "in this conversation" slots.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reworked the API to fix both. action is now split into actionVerb ("tag or delete", used in the do-NOT line) and actionGerund ("tagging or deletion", used in the "queued for …" slot), and itemNoun is split into itemNounSingular/itemNounPlural so callers aren't forced through an (s) suffix that doesn't pluralize every noun. Tests updated to assert the gerund renders in the "queued for" slot.

* back to no email. Falls back to the owner's name (the login/username, an email on Tableau Cloud)
* when no email field is returned.
*
* Shared by the delete tools (delete-workbook, delete-datasource) and the content-owner-emails

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit — JSDoc references a non-existent helper. This says "Shared by the delete tools (delete-workbook, delete-datasource) and the content-owner-emails helper" but no content-owner-emails helper exists in this PR. Either drop the trailing reference, or add a TODO if it's coming in a follow-up.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed — dropped the content-owner-emails reference. The JSDoc now reads "Shared by the delete tools (delete-workbook, delete-datasource)…" only. (That helper was removed earlier in 2B-4; the stale mention slipped through.)

Comment thread src/prompts/_lib/confirm.ts Outdated

return [
`🛑 STOP — REQUIRED HUMAN CONFIRMATION before any ${action}.`,
`Present ${countText} ${itemNoun}(s) queued for ${action} to the user and ask them to explicitly ` +

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider changing itemNoun to itemNounSingular and adding an itemNounPlural argument since it's not guaranteed (s) makes the noun plural.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — itemNoun is now itemNounSingular + itemNounPlural (callers pass e.g. "stale item"/"stale items"), so we no longer rely on (s). Combined with @Akash-Rastogi's note on the same line, also split action into verb/gerund forms.

Comment thread src/prompts/_lib/confirm.ts Outdated
/** Columns the model should show per item so the human can make an informed decision. */
presentColumns?: ReadonlyArray<string>;
}): string {
const countText = itemCount === undefined ? 'the' : `the ${itemCount}`;

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider changing the condition to itemCount === undefined || itemCount === 1 so the message doesn't awkwardly say "Present the 1 workbook" as opposed to "Present the workbook"

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — but rather than special-casing === 1, I removed the itemCount parameter altogether (see @Akash-Rastogi's thread on line 30): the count isn't known at prompt-render time, so it was reachable only from tests. With it gone the awkward "Present the 1 workbook" phrasing can't occur.

return `⚠️ WARNING: deleting this data source may break ${parts.join(' and ')}.`;
}

// (owner-email resolution moved to ../users/resolveOwnerEmail.js — shared with delete-workbook.)

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove comment, not valuable.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed.

`Tableau UI). Defaults to '${DEFAULT_PENDING_DELETION_TAG}'.`,
),
dryRun: z
.enum(['true', 'false'])

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not boolean?

@Alon-ST-DATA Alon-ST-DATA Jun 19, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MCP prompt arguments are string-only on the wire — the GetPrompt request carries arguments as { [key: string]: string }, so a z.boolean() arg would never receive an actual boolean. Modeled it as z.enum(['true','false']) and parse with args.dryRun !== 'false'. Added a code comment explaining this.
@anyoung-tableau if you still disagree here I can re-look into using a boolean instead.

@Alon-ST-DATA

Alon-ST-DATA commented Jun 19, 2026

Copy link
Copy Markdown
Contributor Author

Thanks all for the thorough review. Pushed a commit addressing every inline thread; summary:

Behavior fixes

  • Prompt-injection (tag): schema now regex(/^[A-Za-z0-9 _-]+$/) — quote/backtick injection rejected at validation.
  • Silent widen on itemTypes: all-unsupported now errors with the supported list instead of falling back to ALL; partial-unsupported runs the supported subset and surfaces a dropped-types note; inputs deduped via Set.

confirm.ts API cleanup

  • Dropped unreachable itemCount.
  • Split actionactionVerb/actionGerund and itemNounitemNounSingular/itemNounPlural for clean multi-word/plural wording.
  • renderConfirmInstructions takes a toolRef; the prompt now renders one block pointing at the routing table instead of one per delete tool.

Wording honesty (P2)

  • delete-tool param descriptions/docstrings + both doc pages no longer claim the token "forces a deliberate two-step delete" — reworded to a friction gate and explicitly note the token is a deterministic hash of caller-known inputs, so it does not by itself prove a preview ran.

Misc: soft-guard note on LARGE_REPORT_THRESHOLD; dryRun string-enum rationale comment (MCP prompt args are string-only); stale content-owner-emails JSDoc ref dropped; low-value moved-comment removed.

Follow-ups to be filed (both under epic a3QEE000002AWIP2A4, GUS WI numbers to follow — per the two "make it hard, server-side" suggestions from @Akash-Rastogi and @mattcfilbert):

  1. Enforce the two-step delete server-side by gating confirm:true on the pending-deletion tag written during preview (turns the advisory friction gate into a real one).
  2. Enforce the stale-content row cap server-side inside get-stale-content-report, so the safety bound does not hinge on model compliance.

Validation: npm run build, tsc --noEmit, npm run lint clean; full suite green (the one jsdom unhandled-error is a pre-existing env issue, unrelated to these changes).

Alon-ST-DATA and others added 2 commits June 19, 2026 12:06
…rimitive cleanup

Review fixes for stale-content-cleanup-apply (W-23049417):

apply.ts
- tag arg: enforce /^[A-Za-z0-9 _-]+$/ to close a prompt-injection vector
  (tag is interpolated into the workflow prompt text)
- itemTypes: error instead of silently widening to all types when the
  supplied list resolves to zero supported types; surface dropped
  unsupported values; dedupe via Set
- render a single confirm-instruction block pointing at the routing table
  instead of one block per delete tool
- document LARGE_REPORT_THRESHOLD as a prompt-text soft guard
- comment why dryRun is a string enum (MCP prompt args are string-only)

confirm.ts (HITL primitive)
- drop unreachable itemCount param (count unknown at prompt-render time)
- split action -> actionVerb/actionGerund and itemNoun -> singular/plural
  so multi-word actions and non-(s) plurals read cleanly

delete tools + docs
- reword the confirmationToken "two-step" claim to an honest friction
  gate: the token is a deterministic hash of caller-known inputs and does
  not by itself prove a preview ran (server-side enforcement is a follow-up)

misc
- resolveOwnerEmail: drop stale content-owner-emails JSDoc reference
- deleteDatasource: remove low-value moved-code comment

Tests updated for the new confirm API and the itemTypes error/dedup/note
paths. build, tsc, lint, and unit suite green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants