Release 11.26.0 by kaihaase · Pull Request #547 · lenneTech/nest-server

kaihaase · 2026-05-31T12:03:35Z

AI Assistant Module: a fully opt-in, multi-tenant-capable, OWASP-aligned assistance system with DB-backed LLM connections, role-filtered tool registry, governed safety mechanisms, and MCP server integration — introduced on branch feature/ai-module.

Core Features:

DB-backed LLM connections with AES-256-GCM encrypted API keys (admin CRUD, hasApiKey output, production-secret guard)
Provider abstraction (OpenAiCompatibleProvider, ClaudeCliProvider) with auto-detection of supportsNativeTools / supportsJsonResponse / contextWindow
Prioritized connection-resolution chain (global default → tenant default → user default → client selection → tenant/admin enforced)
Role-filtered tool registry with mutating / destructive flags and pre-flight authorize() hooks
Plan mode with all-or-nothing pre-flight authorization of every planned step
Confirmation policy (mutating.default/enforced, destructive always-confirm, persistent tool grants "remember my choice")
Scoped tool policies (deny / ask / allow against tool arguments via regex)
Lifecycle hooks (PreToolUse / PostToolUse / SessionStart / Stop)
Token budgets per user AND per tenant (day / month / none reset windows, cumulative usage report, HTTP 429 with i18n translation)
Multi-turn conversations with $push-based message persistence and capped retention
SSE streaming endpoint (POST /ai/stream) plus REST and GraphQL single-shot
Multi-modal attachments (image URLs / dataUrls in prompts)
Admin-defined named agent modes with allowedTools filter and prompt addendum
Self-optimizing prompts (admin-editable tenant-scoped slots with override/reset, fragment-based builder, soft-delete via enabled: false)
Governed learning loop (prompt hints from tool failures, admin-approval-gated or autoApply)
User-facing prompt templates (scope: user|tenant, owner-only mutations)
Runtime placeholder registry ({{userId}}, {{roles}}, {{tools}}, …, project-specific placeholders dynamically registrable)
LLM-driven context compaction on context-window overflow with hard-trim fallback
Deferred tool schemas plus built-in search_tools meta-tool for large tool catalogs
Built-in ask_user_question tool for interactive clarification
Audit logging into aiInteractions (admin-readable, prerequisite for budgets)
MCP server (/ai/mcp Streamable HTTP) with lazy-loaded @modelcontextprotocol/sdk and 503 fallback when the SDK is missing
OAuth 2.1 for MCP clients (HMAC-SHA256 access tokens with timingSafeEqual, PKCE S256-only, refresh-token rotation bound to client ID, dynamic client registration with persisted client_secret)
SSRF allowlist for connection base URLs (ai.allowedBaseUrlHosts)
Per-user rate limiting (max / windowSeconds)
Three-layer security model (@Restricted / @Roles / securityCheck) with global stripping of apiKeyEncrypted and all token fields
Multi-tenancy isolation on slots, prompts, hints, budgets, and conversations
Full override hooks for every collaborator via ICoreModuleOverrides.ai
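
The first feature above (AES-256-GCM encrypted API keys with a `hasApiKey`-only output) can be sketched roughly as follows. This is a minimal illustration using Node's built-in `crypto` module, not the module's actual implementation: the helper names (`encryptApiKey`, `decryptApiKey`, `toOutput`), the scrypt key derivation with a fixed salt, and the `iv.tag.data` storage layout are all assumptions made for the example.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from 'node:crypto';

// Assumption: the secret comes from configuration; a literal is used here only for the sketch.
const secret = 'dev-only-secret';
// Assumption: derive a 32-byte (AES-256) key via scrypt with a fixed, hypothetical salt.
const key = scryptSync(secret, 'ai-connection-salt', 32);

// Encrypts an API key; the stored value is "iv.tag.ciphertext", each part base64-encoded.
function encryptApiKey(plain: string): string {
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const data = Buffer.concat([cipher.update(plain, 'utf8'), cipher.final()]);
  const tag = cipher.getAuthTag(); // authentication tag, checked on decrypt
  return [iv, tag, data].map((part) => part.toString('base64')).join('.');
}

// Decrypts a stored value; throws if the ciphertext or tag was tampered with.
function decryptApiKey(stored: string): string {
  const [iv, tag, data] = stored.split('.').map((part) => Buffer.from(part, 'base64'));
  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([decipher.update(data), decipher.final()]).toString('utf8');
}

// Output shape for clients: the encrypted key is never returned, only a boolean flag.
function toOutput(conn: { name: string; apiKeyEncrypted?: string }): { name: string; hasApiKey: boolean } {
  return { name: conn.name, hasApiKey: Boolean(conn.apiKeyEncrypted) };
}
```

Because GCM is authenticated, a modified ciphertext fails in `decipher.final()` instead of yielding garbage, which is what makes a boot-time "can this key still be decrypted" check possible.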

* feat(ai): add AI assistant module foundation (connections, tools, orchestrator)

Adds an extensible AI-assistant layer to the core:

- DB-backed LLM connections (CoreAiConnection) with AES-256-GCM-encrypted API keys (admin CRUD; key never returned, only hasApiKey)
- Provider abstraction (ILlmProvider) + fetch-based OpenAiCompatibleProvider (mittwald/OpenAI-compatible) + LlmProviderFactory
- Global AiToolRegistry with role-filtered, self-registering tools (IAiTool/AiTool)
- CoreAiService orchestrator with emulated tool calling (mittwald has no native tool calling), rate-limit + audit hooks
- GraphQL resolver + REST controller (aiPrompt + connection CRUD)
- CoreAiModule.forRoot (autoRegister + overrides), `ai` config in IServerOptions, CoreModule wiring, exports, FRAMEWORK-API regen
- Example User tools + AiToolsModule in src/server
- Docs (README, INTEGRATION-CHECKLIST, configurable-features) + AI-MODULE-PLAN.md
- Tests: unit 7/7, e2e 6/6 (full suite 85/85)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(ai): persist prompt runs as audit records (Phase 3)

- CoreAiInteraction model (aiInteractions, admin-only) + CoreAiInteractionService with system-internal record()
- CoreAiService.audit() persists when ai.audit is enabled (optional injection, never breaks a prompt response)
- Admin read endpoints (findAiInteractions/getAiInteraction) on resolver + controller
- ai.audit config flag; model/service wired into CoreAiModule (+ override option)
- Test: prompt persists an audit record (e2e 7/7)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(ai): multi-turn conversations with persisted message history (Phase 4)

- CoreAiConversation model (aiConversations) + CoreAiMessage subdoc; owner-scoped via securityCheck (creator/admin only)
- CoreAiConversationService with appendMessage() ($push, never round-trips the subdocument array through update())
- CoreAiService loads prior turns into the LLM context when conversationId is given and appends the user+assistant turns after the run
- Owner-scoped CRUD endpoints (create/find/get/delete) on resolver + controller
- Model/service wired into CoreAiModule (+ override option), exports
- Test: 2-turn conversation keeps context and persists 4 messages (e2e 8/8)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(ai): stream prompt answers via SSE (Phase 5)

- CoreAiService.promptStream(): emits action events, then the answer as token chunks, then a final event (reuses prompt() so history/audit/persistence apply)
- AiStreamEvent type; POST /ai/stream SSE endpoint (raw @Res, role-guarded)
- Test: streaming yields action/token/final, tokens concatenate to the answer (unit 8/8, e2e 8/8)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(ai): require confirmation for destructive tool actions (Phase 6)

- IAiTool.destructive flag; CoreAiPromptInput.confirm
- CoreAiResponse.requiresConfirmation + pendingActions
- Orchestrator halts on a destructive tool call until the prompt is re-sent with confirm: true (no execution, no conversation persistence while pending)
- Example destructive tool delete_user (admin) added to the reference tools
- Test: destructive tool blocked until confirmed, then executes (unit 9/9)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(ai): expose tool registry as an MCP server (Phase 7)

- CoreAiMcpService: role-filtered mcpListTools()/mcpCallTool() (testable logic) and lazy createServer() using the low-level MCP SDK Server with JSON-schema tool definitions (no zod conversion)
- CoreAiMcpController: Streamable HTTP at /ai/mcp (POST/GET/DELETE), per-session McpServer bound to the authenticated user (Bearer via @CurrentUser), session map with eviction, MCP-style 401 + WWW-Authenticate
- ai.mcp config flag; controller registered only when enabled; @modelcontextprotocol/sdk added (fixed 1.29.0, lazy-loaded)
- Tests: MCP role gating + permitted/forbidden calls (unit 10/10); MCP 401 over HTTP + app boot with MCP (e2e 9/9)
- Docs (README, configurable-features) + AI-MODULE-PLAN updated

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ai): keep JSON scalar available for bare GraphQL apps; isolate prompt e2e

- Package types using the JSON scalar (AI models) leak into the global GraphQL type registry, so any schema build needs the JSON scalar provided. Real apps provide it via ServerModule; the framework-internal error-code-scenarios bare apps now provide it too (one JSON provider per app, no consumer impact).
- Make the AI prompt e2e deterministic under parallel runs by exercising a registered test tool instead of find_users (shared User collection).

Full e2e suite green: 61 files, 1770 tests.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(ai): plan mode + confirmation policy + client metadata + prompt enrichment (Phases 9-12)

Phase 9: Plan mode (Goals 1/2/5). input.mode='plan' produces a full plan, then pre-flight authorizes ALL steps (registry role filter + optional IAiTool.authorize() dry-run) BEFORE executing anything. If any step is not permitted, NOTHING runs and a translated (de/en) error with deniedActions is returned. Otherwise steps run in order.

Phase 10: Confirmation policy for mutating actions. IAiTool.mutating; ai.confirmation.mutating { default, enforced }; client override via input.requireConfirmation (ignored when enforced); destructive tools always require confirmation. Applies to both auto and plan modes.

Phase 11: Client metadata. input.metadata (URL, navigation, console logs) injected as a clearly-delimited, size-capped, UNTRUSTED context message (prompt-injection hardening).

Phase 12: Prompt enrichment. System prompt now includes the user's roles + available tools and optional system documentation (ai.documentation / overridable getDocumentation()).

Refactor: prompt() dispatches to runAuto/runPlan via shared prepareRun; extracted helpers (authorizeCall, confirmationRequiredFor, translate, appendClientContext, …).
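
The all-or-nothing pre-flight described for Phase 9 could look roughly like the sketch below. It is a simplified, synchronous illustration under stated assumptions: the `Tool`, `Step`, and `PlanResult` shapes and the `runPlan` signature are hypothetical stand-ins, and the real orchestrator is presumably asynchronous and returns translated error messages.

```typescript
// Hypothetical, simplified shapes; the module's IAiTool has more fields.
interface Tool {
  name: string;
  roles: string[]; // roles allowed to use the tool
  authorize?: (args: unknown, userRoles: string[]) => boolean; // optional dry-run check
  execute: (args: unknown) => unknown;
}
interface Step {
  tool: string;
  args: unknown;
}
interface PlanResult {
  denied: boolean;
  deniedActions: string[];
  results: unknown[];
}

// Authorizes every planned step first; executes only if all steps are permitted.
function runPlan(steps: Step[], tools: Map<string, Tool>, userRoles: string[]): PlanResult {
  const deniedActions = steps
    .filter((step) => {
      const tool = tools.get(step.tool);
      if (!tool) return true; // unknown tools count as denied
      const roleOk = tool.roles.some((role) => userRoles.includes(role));
      const authOk = tool.authorize ? tool.authorize(step.args, userRoles) : true;
      return !(roleOk && authOk);
    })
    .map((step) => step.tool);

  // One denied step blocks the whole plan: nothing is executed.
  if (deniedActions.length > 0) return { denied: true, deniedActions, results: [] };

  // All steps passed the pre-flight, so run them in order.
  const results = steps.map((step) => tools.get(step.tool)!.execute(step.args));
  return { denied: false, deniedActions: [], results };
}
```

Separating the authorization pass from the execution pass is what keeps a partially permitted plan from leaving side effects behind.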
CoreAiResponse gains plan/denied/deniedActions.

Tests: unit 19/19, ai e2e 9/9.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(ai): per-user daily cost/budget enforcement (Phase 8)

- ai.budget { maxPromptsPerDay, maxTokensPerDay } enforced in prepareRun before any LLM call; exceeding it aborts with HTTP 429 + translated message
- CoreAiInteractionService.usageSince() aggregates today's prompts/tokens per user
- ai.defaultMode lets admins default to plan mode; ai.confirmation/documentation typed
- Tests: budget block + under-budget pass (unit 21/21)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(ai): MCP full-handshake test + robust MCP user resolution (Phase 7b.1)

- CoreAiMcpController resolves the user via req.user and, as a fallback, by verifying the Bearer token directly (BetterAuthTokenService), so MCP works even though the S_EVERYONE guard does not populate the user
- Full MCP protocol test over the SDK in-memory transport: initialize handshake → tools/list (role-filtered) → tools/call execution (unit 22/22)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(ai): MCP OAuth 2.1 security core + provider + mount helper (Phase 7b.2)

- CoreAiMcpOAuthService: HMAC-signed access tokens (constant-time verify), PKCE S256 verification, MongoDB-backed stores (clients/codes-TTL/refresh), loadUser, and buildOAuthProvider() implementing the SDK OAuthServerProvider (clients store, PKCE challenge lookup, code/refresh exchange, access-token verification)
- mountAiMcpOAuth(app) helper to mount mcpAuthRouter in main.ts (lazy SDK)
- MCP controller now also accepts OAuth access tokens when ai.mcp.oauth is enabled
- ai.mcp gains { oauth, oauthSecret }; interactive consent is overridable (authorizeConsent) and documented for consumer main.ts integration
- Tests: token roundtrip/tamper/expiry + PKCE S256 (unit 26/26)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(ai): finalize AI module docs; remove temporary implementation plan

- README: plan mode, confirmation policy, client metadata, prompt enrichment, budget, and MCP OAuth 2.1 sections
- INTEGRATION-CHECKLIST: advanced config + main.ts OAuth mounting step
- configurable-features: full ai.* config reference
- Remove AI-MODULE-PLAN.md (all backend phases done; full suite 1786 tests green)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(ai): per-user/tenant token budgets with defaults + usage reporting (Phase 13)

- CoreAiBudgetLimit model (aiBudgetLimits, admin CRUD) for per-user/per-tenant limit overrides; CoreAiBudgetService resolves override → ai.budget default → unlimited (0/missing = unlimited). period day/month/none with resetAt.
- ai.budget restructured: { period, user:{maxTokens,maxPrompts}, tenant:{...} }.
- Enforcement (user OR tenant) before the run → HTTP 429 + translated message; usage read via a read-only native count over aiInteractions (tenant filter).
- tenantId captured on aiInteractions (tenant plugin) for per-tenant accounting.
- Every response carries a compact `budget` summary (promptTokens, usedTokens, remainingTokens, resetAt); full breakdown via aiUsage query / GET /ai/usage.
- Admin budget-limit CRUD + aiUsage endpoints (resolver + controller); module wiring, exports, docs. Replaces the flat Phase-8 budget.
- Tests: unit (resolve/assert/summary/usage-info) + e2e (limit CRUD, response budget, 429 enforcement, aiUsage). Full suite 1792 green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* refactor(ai): make the module fully provider-agnostic (capabilities, no vendor names)

- ILlmProvider declares LlmCapabilities { nativeTools, jsonResponse, systemPrompt } (replaces supportsNativeTools); orchestrator compensates across all gradations.
- Capabilities configured per connection (supportsNativeTools, supportsJsonResponse); OpenAiCompatibleProvider derives them and sends native tools / response_format only when supported.
- Removed all concrete vendor/runtime names from code; docs neutralized to describe the OpenAI-compatible API shape (protocol, not a vendor) + capability gradations.
- config.env example seed genericized (AI_BASE_URL/AI_API_KEY, only seeded when set).
- Tests + docs updated. Unit 31/31, ai e2e 10/10.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* AI module: DB-backed connection resolution chain (provider-agnostic)

Adds a prioritized, fully overridable connection-resolution chain so projects with multiple LLM connections can flexibly pick one per request. No connections → AI handling is disabled (translated "unavailable" response); exactly one → it is the implicit default; multiple → resolved via 8 ascending-priority layers.

Resolution order (later overrides earlier):

1. global default (isDefault): soft (must be available to tenant)
2. tenant default (preference): soft
3. user default (preference): soft
4. client selection (input): soft
5. tenant-enforced (preference): hard (mandatory, wins regardless)
6. admin-enforced global (flag): hard
7. admin-enforced per tenant (flag): hard
8. code override (serviceOptions): hard (deliberate, trusted top layer)

Each layer is an overridable protected method; resolutionLayers() can be reordered/replaced. Availability is restrictable per tenant via Connection tenantIds (empty = all tenants).
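
The resolution order above can be sketched as a loop over ascending-priority layers in which each later hit overrides the earlier one. This is an illustrative reading, not the real CoreAiConnectionResolverService: the `Ctx` field names, the `layers` array, and the `resolve` function are hypothetical, and the sketch assumes that "soft" means "skipped when the connection is not available to the tenant" while "hard" layers are applied without that check.

```typescript
type Strength = 'soft' | 'hard';

interface Connection {
  id: string;
  tenantIds: string[]; // empty = available to all tenants
}

// Hypothetical request context carrying one candidate connection id per layer.
interface Ctx {
  tenantId?: string;
  globalDefault?: string;
  tenantDefault?: string;
  userDefault?: string;
  clientSelection?: string;
  tenantEnforced?: string;
  adminEnforcedGlobal?: string;
  adminEnforcedTenant?: string;
  codeOverride?: string;
}

interface Layer {
  name: string;
  strength: Strength;
  pick: (ctx: Ctx) => string | undefined;
}

// Ascending priority: a later layer overrides an earlier one.
const layers: Layer[] = [
  { name: 'globalDefault', strength: 'soft', pick: (c) => c.globalDefault },
  { name: 'tenantDefault', strength: 'soft', pick: (c) => c.tenantDefault },
  { name: 'userDefault', strength: 'soft', pick: (c) => c.userDefault },
  { name: 'clientSelection', strength: 'soft', pick: (c) => c.clientSelection },
  { name: 'tenantEnforced', strength: 'hard', pick: (c) => c.tenantEnforced },
  { name: 'adminEnforcedGlobal', strength: 'hard', pick: (c) => c.adminEnforcedGlobal },
  { name: 'adminEnforcedTenant', strength: 'hard', pick: (c) => c.adminEnforcedTenant },
  { name: 'codeOverride', strength: 'hard', pick: (c) => c.codeOverride },
];

function availableTo(conn: Connection, tenantId?: string): boolean {
  return conn.tenantIds.length === 0 || (tenantId !== undefined && conn.tenantIds.includes(tenantId));
}

// Returns the resolved connection id, or undefined when AI handling is disabled.
function resolve(ctx: Ctx, connections: Connection[]): string | undefined {
  if (connections.length === 0) return undefined; // no connections: AI disabled
  if (connections.length === 1) return connections[0].id; // exactly one: implicit default
  let selected: string | undefined;
  for (const layer of layers) {
    const id = layer.pick(ctx);
    if (!id) continue;
    const conn = connections.find((c) => c.id === id);
    if (!conn) continue; // assumption: ids of unknown connections are ignored
    if (layer.strength === 'soft' && !availableTo(conn, ctx.tenantId)) continue;
    selected = id;
  }
  return selected;
}
```

Keeping the layers in a plain ordered array mirrors the idea that a project can reorder or replace individual layers without touching the loop.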
New:

- CoreAiConnectionResolverService (the chain; overridable per project)
- CoreAiConnectionPreference model/input/service (tenant/user defaults + tenant-enforced, unique per (scope, refId))
- CoreAiAvailableConnection model (non-sensitive list with selected/locked flags)
- Connection fields: tenantIds, enforced, enforcedTenantIds
- Endpoints: aiAvailableConnections, aiSetUserConnection (S_USER self-service), admin preference CRUD (GraphQL + REST under /ai/connections/*)

Orchestrator now resolves the connection via the chain (falls back to the plain connection service when no resolver is wired), returns a denied response when no connection is usable, and honors the serviceOptions._aiConnectionId code override.

Module wiring, ICoreModuleOverrides.ai (connectionResolver, preferenceService), barrel + top-level exports, README/INTEGRATION-CHECKLIST/configurable-features docs, and FRAMEWORK-API.md updated.

Tests: 14 unit (resolution chain incl. each layer, availability filtering, one=default, none=disabled, subclass override) + 5 e2e (per-tenant restriction, self-service validation, tenant-enforced lock, disabled response, endpoint auth). Full suite green (1811 tests). Removes the temporary AI-MODULE-PLAN.md.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* AI MCP: silence false-positive prefer-add-event-listener lint warning

The MCP SDK transport exposes `onclose` as a callback property, not a DOM EventTarget, so addEventListener does not apply. Add a targeted inline disable.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* AI connection resolution: robustness, validation, dedupe & cleanup

Four follow-up optimizations to the connection-resolution chain:

1. Robustness: the chain now drops a selection that points to a deleted/disabled connection (orphaned enforced preference or stale code override) with a warn log and degrades to the fallback, instead of returning a dead id that made connectionService.resolve() throw a 404 mid-prompt.
2. Admin validation: CoreAiConnectionResolverService.setPreference() verifies the connection exists and is usable before persisting; the admin GraphQL/REST preference endpoints route through it (fail-fast instead of a dangling preference).
3. Performance: tenantDefault (layer 2) and tenantEnforced (layer 5) now share a single tenant-preference DB read per resolution (memoized on the ctx via WeakMap).
4. Cleanup: deleting a connection removes preferences pointing to it (PreferenceService.deleteByConnectionId + ConnectionService.delete override with an @Optional preference service; best-effort, never fails the delete).

No DI cycle: resolver → {connection, preference}; connection → preference; preference → none.

Tests: +4 unit (P1 stale enforced + stale code override degrade, P2 setPreference validation, P3 single tenant query) and +2 e2e (P4 preference cleanup on delete, admin setPreference validation). Full suite green (1817 tests). README updated.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* AI module: address review findings (perf, security, docs, tests)

Implements all optimizations from the multi-agent review of feature/ai-module.
Performance:

- getUsage() now sums prompts/tokens server-side via a $group aggregation instead of loading every aiInteractions doc per prompt
- Added compound indexes { userId, createdAt } and { tenantId, createdAt } on aiInteractions for the budget period query
- buildSummary() skips the usage aggregation for unlimited users (no finite limit)
- Conversation history loads via a lean, projected, $slice-capped read (loadRecentMessages) instead of a hydrated get() running the full process() pipeline over the whole messages array
- appendMessage() caps the messages array with $slice (-500)

Security:

- Adopted the ErrorCode registry for all thrown AI exceptions; added the LTNS_0600–LTNS_0608 AI error codes (de/en translated)
- Added enum validation (IsIn) for mode / scope / period and @MaxLength on prompt
- CoreAiConnectionPreference model tightened from S_USER to ADMIN
- CheckSecurityInterceptor now MERGES (union) project secretFields with the framework defaults so a custom list can't drop password/apiKeyEncrypted/etc.
- Optional SSRF allowlist ai.allowedBaseUrlHosts (opt-in; local providers like Ollama still work by default)
- ai-crypto getKey() and provider mapNativeToolCalls() are now protected (overridable)

Docs:

- Removed all residual vendor names from shipped code (provider-agnostic)
- Reconciled the AI override fields across ICoreModuleOverrides.ai (now 10), configurable-features.md, and the README override table
- .env.example: added AI env vars incl. NSC__AI__ENCRYPTION_SECRET
- FRAMEWORK-API generator now emits IAi / IAiRateLimit / IAiDefaultConnection
- REQUEST-LIFECYCLE.md documents the AI/MCP/SSE/OAuth entry points
- New migration guide migration-guides/11.25.x-to-11.26.0.md
- Fixed the MCP controller JSDoc (resolveUser, not @CurrentUser) + provider chat() JSDoc

Tests:

- Unit: OpenAiCompatibleProvider (capabilities, mapNativeToolCalls, transport/HTTP error mapping, allowlist enforcement) + OAuth buildOAuthProvider wiring
- E2e: HTTP secret-exposure (apiKeyEncrypted never returned), admin 403 for a non-admin, authenticated self-service 200, SSE framing, shipped get_user tool ownership (S_SELF) and admin-only delete_user visibility

Full `pnpm run check` green (audit, format, lint, tests, build, server start).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* AI module: capability auto-detection + review remediation

Auto-detection of JSON / native-tool support (provider-agnostic). Connection flags supportsJsonResponse / supportsNativeTools are now OPTIONAL: undefined = auto-detect, explicit true/false = authoritative (never probed). Removed their mongoose default so "not set" stays distinguishable.

Detection (A + B, sharing one probe path):

- A (eager): on create with an undefined flag, the endpoint is probed once and the result persisted. Best-effort, never fails the create.
- B (lazy): if a flag is still undefined at prompt time, the orchestrator probes once and persists; until then the safe emulated baseline applies.
- On demand: admin detectAiConnectionCapabilities / POST /ai/connections/:id/detect-capabilities.

Probe (OpenAiCompatibleProvider.detectCapabilities, optional on ILlmProvider): response_format: json_object (2xx → JSON) and a trivial tool with tool_choice:'required' (2xx WITH tool_calls → native tools; 4xx / silent-ignore → unsupported). config.env seed only sets the flags when AI_SUPPORTS_* env vars are provided.

Review remediation catalog (all 6):

1. GraphQL resolver tests (admin findAiConnections happy-path + non-admin Access-denied)
2. get_user ownership denial now asserts a typed 401/403 error (was rejects.toBeDefined)
3. Dropped the redundant single-field index on aiInteractions.userId (compound covers it)
4. Removed the duplicate IsNotEmpty() on prompt (auto-applied for required fields)
5. assertWithinBudget(language) documented as deprecated/unused (kept for compat)
6. Added a mountAiMcpOAuth mount-wiring unit test

Tests: +7 unit (detectCapabilities x3, mount, GraphQL coverage) and +6 e2e (eager/explicit/endpoint/403 detection, GraphQL admin+denial). README, IAiDefaultConnection JSDoc, and the migration guide document auto-detection. Full `pnpm run check` green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* chore(review): persist lt-dev review-agent learnings

Durable knowledge captured by the lt-dev review agents during the AI-module reviews (backend, docs, performance, security, test reviewers): e.g. AI secret stripping, FRAMEWORK-API generator allowlist, and the doc surfaces a configurable feature must touch. Consistent with the already-tracked agent-memory structure.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ai): harden emulated tool-calling prompt against fabricated execution

In emulated tool calling (providers without native function calling), weak models sometimes reply with a natural-language "action done" message without ever emitting the tool_calls request, so the tool never actually runs and the confirmation gate is bypassed (a false positive, never a real side-effect: execution can only happen via executeToolCall() + the gate).

Add an explicit instruction to the emulated tool protocol: never claim to have executed/performed/deleted/updated/created anything without first emitting a tool_calls request and receiving its TOOL_RESULTS.
Document the residual limitation (and the intact security guarantee) in the module README, recommending native tool-calling backends for action-heavy workflows.

Observed during real fullstack E2E testing against a live OpenAI-compatible endpoint without native tool support. Full e2e suite green (1839 tests).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* style(ai): oxfmt the emulated-mode limitation note in README

Format-only follow-up to 786aee3: normalize markdown emphasis (oxfmt). No content change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(ai): add opt-in ClaudeCliProvider (Claude Code CLI as an LLM backend)

Proves the module connects to all three backend kinds (external (hosted OpenAI-compatible), local (e.g. Ollama via the same provider) and the local Claude Code CLI) via the existing ILlmProvider extension point.

ClaudeCliProvider shells out to `claude -p --output-format json` and parses the single result + usage. Security model:

- `--tools ""` disables ALL of Claude Code's own tools. It is a pure text generator, so it cannot read files, run shell commands, or reach the network on its own. Tool calling is emulated (the orchestrator executes tools itself via CrudService with the caller's permissions).
- `spawn` with an argument array (never a shell), so prompt content can't be interpreted as a command. Conversation is piped via stdin.
- Runs in tmpdir (no CLAUDE.md/settings auto-discovery), `--system-prompt` replaces Claude Code's default agent prompt, `--no-session-persistence`, and a timeout that kills the child on overrun.

Optional `ai.claudeCli` config (bin/extraArgs/maxBudgetUsd); `apiKey` is forwarded as ANTHROPIC_API_KEY. Opt-in (not auto-registered): register via `factory.registerBuilder('claude-cli', (c) => new ClaudeCliProvider(c))`. Exported from the AI module barrel + top-level index. README documents external / local / CLI connection recipes.
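
The spawn-based invocation described in the security model could be sketched as below. The CLI flags are the ones quoted in the commit message; everything else (the `buildArgv` and `runCli` names, the flag order, the SIGKILL on timeout, and the plain-text stdout handling) is an assumption made for illustration rather than the provider's actual code.

```typescript
import { spawn } from 'node:child_process';
import { tmpdir } from 'node:os';

// Builds the argument array from the flags named in the commit message.
// Each value is its own array element, so nothing is ever parsed by a shell.
function buildArgv(systemPrompt: string): string[] {
  return [
    '-p',
    '--output-format', 'json',
    '--tools', '', // empty tool list: the CLI acts as a pure text generator
    '--system-prompt', systemPrompt,
    '--no-session-persistence',
  ];
}

// Hypothetical runner: pipes the conversation via stdin, runs in the temp dir,
// and kills the child process when the timeout elapses.
function runCli(bin: string, argv: string[], stdinText: string, timeoutMs: number): Promise<string> {
  return new Promise((resolve, reject) => {
    const child = spawn(bin, argv, { cwd: tmpdir() }); // no `shell: true`
    let out = '';
    const timer = setTimeout(() => child.kill('SIGKILL'), timeoutMs);
    child.stdout.on('data', (chunk) => {
      out += chunk;
    });
    child.stdin.on('error', () => {
      // Ignore write errors (e.g. the child exited early); 'close' reports the outcome.
    });
    child.on('error', (err) => {
      clearTimeout(timer);
      reject(err);
    });
    child.on('close', (code) => {
      clearTimeout(timer);
      if (code === 0) resolve(out);
      else reject(new Error(`CLI exited with code ${code}`));
    });
    child.stdin.end(stdinText);
  });
}
```

A call such as `runCli('claude', buildArgv('You are a helpful assistant.'), transcript, 60_000)` would then resolve with the raw stdout, which a provider could parse as JSON.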
+7 unit tests (capabilities, tool-free argv, transcript flattening, result/usage parse, error + transport mapping, factory wiring); live-verified against the real CLI. Full e2e suite green (1846 tests).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ai): never leak a bare tool-protocol wrapper as the final answer

In emulated tool calling a (weaker / JSON-mode) model sometimes replies with a bare protocol wrapper like `{"tool_calls":[]}` instead of a user answer. The orchestrator treated that as the final text and surfaced the raw JSON to the user.

Now, when the response parses to a protocol-shaped object with no `final`:

- nudge the model once ("reply with your final answer as {\"final\":\"…\"}") and continue the loop, and
- if it still returns only a `tool_calls`/`final` wrapper, drop it so the generic "could not produce a final answer" fallback applies instead of leaking JSON.

Found during real fullstack E2E with a local Ollama model (qwen2.5) in JSON mode. +2 unit tests (nudge recovers a final answer; bare wrapper never surfaces). Full e2e suite green (1848 tests).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ai): fail loud in prod on missing secrets + boot self-check for stale keys

Hardening to match the existing email/cookie production guards. Previously a missing AI encryption secret only logged a warning and silently used a public, insecure development default, so you could ship insecure key storage to production.

- AiCryptoService.assertProductionSafe() (run in onModuleInit): throws at boot in production/staging when AI is active but no secret is resolvable (ai.encryptionSecret / NSC__AI__ENCRYPTION_SECRET / SECRETS_ENCRYPTION_KEY). Non-prod keeps the warn-only dev default. getKey() now shares resolveSecret().
- CoreAiMcpOAuthService.assertProductionSafe() (onModuleInit): same guard for the OAuth signing secret, but only when ai.mcp.oauth is enabled.
- CoreAiConnectionService: the defaultConnection seed now WARNS when configured incompletely (missing baseUrl/model/name) instead of skipping silently; plus a best-effort boot self-check that logs which connections' stored API keys can no longer be decrypted (e.g. after a secret rotation), actionable at boot instead of failing deep in a request. Never throws.

Behavior change (warn → throw in prod/staging): documented in the AI INTEGRATION-CHECKLIST and the 11.26.0 migration guide. +2 unit tests (both guards across prod/staging/local × secret/no-secret × oauth on/off). Full e2e green (1850).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ai): extract emulated tool calls even when the model appends trailing text

Emulated tool-call extraction used a naive first-`{` … last-`}` slice. A model that keeps writing after its JSON (e.g. the Claude CLI emits a valid `{"tool_calls":[…]}` and then a self-hallucinated `TOOL_RESULTS:` block in the same turn) produced an unparseable slice, so the tool call was missed entirely: the tool never ran and the raw JSON leaked as the final answer. This broke function calling on any backend that doesn't stop cleanly after the JSON.

- extractJsonObject() now prefers the first *brace-balanced* object (string-aware), falling back to the old slice. New protected firstBalancedJson() helper.
- Emulated tool protocol now instructs the model to STOP after tool_calls and not write the results itself.

Verified end-to-end through the orchestrator with the real Claude CLI: server_time is now actually executed (2 iterations, correct final answer) where it previously failed. +1 unit test reproducing the trailing-text case. Full e2e green (1851).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ai): feed back only normalized tool_calls as the assistant turn

After detecting emulated tool calls, the orchestrator pushed the model's raw completion text as the assistant turn.
When a model appends a self-hallucinated `TOOL_RESULTS:` block after its tool_calls (observed with the Claude CLI), that fake block was fed back alongside the REAL results, confusing the model: its final answer became a meta-complaint ("these TOOL_RESULTS were not requested by me…") instead of a clean summary, even though the tools executed correctly.

Now record a normalized assistant turn containing only `{"tool_calls":[…]}` (name + arguments), never the raw text.

Verified end-to-end via Chrome MCP across all three backends (external/mittwald, local/Ollama, Claude CLI): multi-tool calls + result processing now produce clean answers, and the confirm-before-execute gate works on each. +1 unit test (assistant turn never carries hallucinated trailing text). Full e2e green (1852).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(ai): self-optimizing prompts (DB-editable templates, learning loop, context-window handling)

Makes prompt construction transparent, adaptive and self-improving so non-technical users only enter domain prompts while the system supplies everything the LLM needs, across all backends (external/local/CLI) and always within the security model.

- DB-editable prompt store (CoreAiPromptTemplate, admin CRUD): the system prompt is assembled from keyed fragments (base, permissions, anti_hallucination, output_contract, tool protocol, error_guidance, learned_hints, plan_protocol). Ships rich built-in defaults; DB rows override per key (locale/capability scoped). No prompt text is hard-coded-and-unreachable.
- Rich auto-enrichment in CoreAiPromptBuilderService (now template-driven + async): roles, exact allowed tools, tool catalog with parameter schemas, anti-hallucination contract ("never invent/guess; use a tool or say you don't know"), structured output contract, error guidance; placeholders rendered at build time.
- Structured tool errors fed back to the model ({ code, message, hint }) so it can self-correct instead of pretending success.
- Governed self-improvement loop (CoreAiPromptHint + CoreAiPromptHintService): orchestrator records failure signals (tool_error/exception/not_available); recurring patterns become learned hints. Default governed (admin approves); ai.promptLearning { enabled, autoApply, minOccurrences } can auto-apply. Hints only ADD guidance and never relax permissions.
- Per user/session context-window handling: contextWindow per connection (+ ai.contextWindow fallback 8192); fitMessagesToContext trims the oldest session-history turns (keeping the system prompt + current input), truncates as a last resort, and caps oversized tool results (ai.maxToolResultChars). Applied in auto + plan modes.
- New override hooks (promptTemplateService, promptHintService), barrel + config exports, IAi config additions.

+9 unit tests. Full e2e green (1858).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(ai): admin CRUD for prompt templates/hints + auto-detected context window

Phase 5/6 of the self-optimizing prompt system:

- Admin endpoints (GraphQL + REST, all @Roles(ADMIN)) to CRUD prompt-template fragments and the learned prompt hints, so admins can edit every prompt text and review/approve/reject the governed learning loop.
- Context-window size is now auto-detected per LLM: OpenAiCompatibleProvider probes a local Ollama /api/show then falls back to a known-model table; ClaudeCliProvider returns the model-appropriate size (200k, 1M for 1m variants). detectAndPersistCapabilities() persists the detected window alongside the capability flags; the orchestrator's lazy-detect guard also fires when the window is still unknown.
- Tests: +unit (detectContextWindow for known models + claude alias) and +e2e (template override applies to the prompt; learning loop end-to-end with a throwing tool → suggested hint → admin approve → hint in prompt; admin-only prompt-template/hint queries + non-admin denial). Full suite green (61 files, 1863 e2e; 80 ai unit). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(ai): register the claude-cli provider in the example AiToolsModule Demonstrates the opt-in LlmProviderFactory.registerBuilder() pattern in the reference implementation so a connection with providerType 'claude-cli' works out of the box in the framework's own server. No effect on existing tests (no test creates a claude-cli connection); full ai e2e green (32). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs(ai): document self-optimizing prompts + context-window handling - README: new "Self-optimizing prompts" (editable templates + governed learning loop) and "Context window" sections; config example + override table extended (promptTemplateService, promptHintService). - configurable-features: AI row + override list updated with the prompt template/hint stores, promptLearning, contextWindow, maxToolResultChars. - INTEGRATION-CHECKLIST: advanced-config example + admin-UI / context-window notes. - migration guide 11.26.0: new-features entry + dedicated subsections. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * style(ai): oxfmt the README and the context-window model table Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(ai): scoped prompt-template fragments (tool:/role:/mode:) Adds an optional `scope` field to CoreAiPromptTemplate. Fragments with a non-empty `scope` are only included when the active run scopes contain it. The prompt builder computes the active scopes from the run: `tool:<name>` for each tool in scope, `role:<name>` for each user role, and `mode:<name>` when a named mode is active (foundation for the upcoming AiMode work). 
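The scope matching described above (`tool:<name>`, `role:<name>`, `mode:<name>`) can be sketched roughly as follows. This is a minimal illustration, not the framework's actual code: the `Fragment` and `RunContext` shapes and the function names `computeActiveScopes` / `selectFragments` are assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; the real CoreAiPromptTemplate has more fields.
interface Fragment {
  key: string;
  content: string;
  scope?: string; // e.g. 'tool:dbQuery', 'role:admin', 'mode:support'
}

interface RunContext {
  toolNames: string[];
  roles: string[];
  mode?: string;
}

// Derive the active scopes of a run: one per tool in scope, one per user role,
// and one for the named mode if a mode is active.
function computeActiveScopes(run: RunContext): Set<string> {
  const scopes = new Set<string>();
  for (const tool of run.toolNames) scopes.add(`tool:${tool}`);
  for (const role of run.roles) scopes.add(`role:${role}`);
  if (run.mode) scopes.add(`mode:${run.mode}`);
  return scopes;
}

// Fragments without a scope always apply; scoped fragments apply only
// when their scope is among the active scopes of the run.
function selectFragments(fragments: Fragment[], run: RunContext): Fragment[] {
  const active = computeActiveScopes(run);
  return fragments.filter((f) => !f.scope || active.has(f.scope));
}
```

With this shape, a fragment scoped to `mode:audit` stays out of the prompt for a run in `support` mode, which is what keeps the prompt focused.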
This lets admins author per-tool / per-role / per-mode guidance fragments that the LLM only sees when relevant — keeping the prompt focused for end-user requests instead of inflating it with every possible instruction. +2 unit tests (template-service scope hop + in-builder fallback). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(ai): AskUserQuestion built-in tool — let the LLM clarify before acting For non-technical end users a focused clarifying question often beats a wrong action. The model can now call the built-in `ask_user_question` tool to pause the run and ask the user (with optional multiple-choice options) instead of guessing. The orchestrator detects the tool's sentinel return shape, breaks the loop, and surfaces `CoreAiResponse.pendingQuestion = { question, options? }`. The client renders it and sends the user's answer as the next prompt — the conversation continues naturally, no special response field needed. The tool is auto-registered with `roles: [S_USER]`, non-mutating; permission restrictions are still enforced backend-side regardless of how the prompt is phrased. +1 unit test verifying short-circuit + payload. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(ai): persistent permission decisions (remember-my-choice for mutating tools) End users should not have to re-confirm the same mutating action repeatedly when they have already granted consent. Adds `aiToolGrants` (admin CRUD) and a new `rememberDecision` field on `CoreAiPromptInput`: - When the user confirms (`confirm: true`) AND sets `rememberDecision` to 'conversation', 'user' or 'tenant', the orchestrator persists a grant for that scope after a successful, non-destructive mutating execution. - On future iterations, the confirmation gate consults the grant store: if an active, non-expired grant covers the tool in any scope (user / tenant / conversation), the gate is skipped for that call. Destructive tools NEVER use grants — they always confirm. 
- Grants only skip the confirmation gate. The permission model itself (`@Restricted`, `@Roles`, `authorize()`) is enforced backend-side regardless. Override the store via `CoreModule.forRoot(env, { ai: { toolGrantService } })`. +2 unit tests verifying skip-on-grant + persist-on-remember. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(ai): lifecycle hooks (PreToolUse / PostToolUse / SessionStart / Stop) Adds a generic hook registry so projects can register compliance-, audit- and sanitization callbacks at well-defined points of the agent loop without forking the orchestrator. Hooks are NestJS providers extending AiHookBase that self-register in the global AiHookRegistry on module init. - `preToolUse(call, tool, event)`: can `{block: true}` the call (returned as a structured BLOCKED_BY_HOOK error to the LLM) or replace `args` (chained sanitization across hooks). - `postToolUse(call, tool, {result, success}, event)`: pure notification — for webhooks, metrics, audit shipping. - `sessionStart(event)`: called once before the first LLM call. - `stop(response, event)`: called once at the end of every run. Errors thrown inside hooks are swallowed + warn-logged — a misbehaving hook must never crash a prompt. Hooks can only ADD restrictions; they cannot relax the existing permission model (@Restricted/@Roles/authorize() still apply). +2 unit tests: blocking hook stops execution + observe hook receives notify. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(ai): scoped tool-policies (deny/ask/allow against tool arguments) Adds a second permission layer underneath the role-based tool gating: admins can author fine-grained allow / ask / deny rules against the arguments of a tool call without needing to write code. 
Example: a generic dbQuery tool can be opened up to a support role with:

- allow `sql` matching `^SELECT\\b`
- deny `sql` matching `(?i)\\b(DROP|TRUNCATE|DELETE FROM)\\b`
- ask `sql` matching `(?i)\\bUPDATE\\b`

Evaluation across all matching policies follows the precedence deny > ask > allow, so a deny anywhere always wins; an ask routes the call through the existing confirmation gate even if the tool itself isn't `mutating`. A pure allow lets the call proceed without re-gating. No matching policy → fall through to the existing behaviour.

Policies stack across scopes: `tool` (any user), `role`, `tenant`, `user`. Like all other layers, policies can only TIGHTEN: they never relax the underlying permission model (@Restricted / @Roles / authorize() still apply).

+2 unit tests verifying deny aborts + ask forces the confirmation gate.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* test(ai): use ConfigService.setConfig({reInit:true}) for the postToolUse hook test to avoid config merge leakage from prior tests

* style(ai): oxfmt new files (hooks, grant service, builder, orchestrator)

* feat(ai): deferred tool-schemas + search_tools meta-tool (#13)

With many tools, the JSON-Schema catalog can dominate the system prompt and cost a large chunk of every request. `ai.deferToolSchemas: true` switches the prompt to a NAMES + descriptions only listing; the LLM uses the new built-in `search_tools` meta-tool to fetch a specific tool's parameter schema on demand. Result: a much smaller default context footprint for projects with rich tool registries, with no loss of capability for the model.

`search_tools` is role-gated and returns only the tools the current user can already see (defense in depth on top of the orchestrator's role filter).

TDD: +2 unit tests (defer-on emits names + hint, schemas hidden; defer-off keeps the full catalog as before).
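The deny > ask > allow precedence of the scoped tool policies can be sketched as below. This is an illustrative sketch under assumptions, not the shipped evaluator: the `ToolPolicy` shape, the `evaluatePolicies` function and the `'fallthrough'` result are hypothetical, and the patterns use the JavaScript `i` flag because JavaScript regular expressions do not accept a leading `(?i)`.

```typescript
type Effect = 'deny' | 'ask' | 'allow';
type Decision = Effect | 'fallthrough';

// Hypothetical policy shape: one regex tested against one tool argument.
interface ToolPolicy {
  tool: string;
  argument: string;
  pattern: RegExp;
  effect: Effect;
}

// Collect the effects of all matching policies, then apply deny > ask > allow.
// No matching policy means the existing (role / mutating) behaviour decides.
function evaluatePolicies(
  policies: ToolPolicy[],
  tool: string,
  args: Record<string, unknown>,
): Decision {
  const matched = policies
    .filter((p) => p.tool === tool)
    .filter((p) => p.pattern.test(String(args[p.argument] ?? '')))
    .map((p) => p.effect);
  if (matched.includes('deny')) return 'deny';
  if (matched.includes('ask')) return 'ask';
  if (matched.includes('allow')) return 'allow';
  return 'fallthrough';
}

// The dbQuery example from the commit message, expressed as policy rows.
const dbQueryPolicies: ToolPolicy[] = [
  { tool: 'dbQuery', argument: 'sql', pattern: /^SELECT\b/, effect: 'allow' },
  { tool: 'dbQuery', argument: 'sql', pattern: /\b(DROP|TRUNCATE|DELETE FROM)\b/i, effect: 'deny' },
  { tool: 'dbQuery', argument: 'sql', pattern: /\bUPDATE\b/i, effect: 'ask' },
];
```

A statement such as `SELECT 1; DROP TABLE users` matches both the allow and the deny rule, and the deny wins. That is the reason for evaluating all matching policies rather than stopping at the first match.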
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(ai): LLM-driven context compaction (#7) When a session would overflow the model's context window, the orchestrator now summarizes the oldest non-system / non-last turns into a single short system message via the same connection's provider — instead of dropping them. The result: long sessions stay coherent (cross-turn intent preserved) instead of losing context to hard truncation. Falls back to the hard trim path on any error. Configurable: `ai.compaction: false` to disable. TDD: +1 unit test with a fake provider returning a known summary verifies the oldest turns get replaced by the summary and the system + current prompt are preserved. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(ai): named agent-modes (#8) — admin-defined preset assistants A `CoreAiMode` row bundles a curated tool whitelist, optional model override, optional role gating and an optional prompt addendum under a name like `support`, `audit`, `billing`. End-user prompts activate one with `agentMode: 'support'` — the orchestrator then narrows the available tool set to the whitelist (built-in ask_user_question + search_tools always stay available — they are essential for end-user UX). Modes only ever TIGHTEN; they cannot relax the underlying permission model. Combines with the existing prompt-template `mode:<name>` scope filter (#2): admins can author per-mode prompt fragments that fire only when that mode is active. TDD: +1 unit test — a model in `support` mode tries to call a non-whitelisted tool and gets the standard TOOL_NOT_AVAILABLE response. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(ai): multi-modal attachments (#17) End users can now attach images / files to a prompt via `input.attachments` (array of `{mimeType, dataUrl|url, name?}`). 
The orchestrator forwards them on the user message; OpenAiCompatibleProvider translates them to the OpenAI content-parts shape (`[{type:'text'},{type:'image_url'}]`) — sent to vision- capable backends (mittwald-Ministral, OpenAI, Anthropic via OpenAI-compat gateways, Ollama vision models). Providers without vision support naturally ignore them. Adds LlmAttachment interface + `LlmMessage.attachments` so other providers can adopt the mapping in one place. Powers product flows like "send a screenshot + describe the bug". TDD: +1 unit test verifies attachments survive from CoreAiPromptInput to the provider on the user message. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(ai): MCP-Client — import tools from external MCP servers (#15) `CoreAiMcpClientService.registerExternalClient({ name, client })` takes an MCP-like client (the SDK `Client`, or a duck-typed one), calls `listTools()`, and registers each as a wrapper tool in the global AiToolRegistry under the namespaced name `<clientName>_<toolName>`. The wrapper dispatches `execute()` back to the MCP server's `callTool` and normalises the response into our {success, data, message} shape. Imported tools are role-gated like any other tool and ship with the conservative defaults `mutating: true` + `roles: [S_USER]` — so they always require confirmation unless an admin scoped-policy (or grant) relaxes that. Projects bring their own MCP client (stdio spawn / StreamableHTTP / SSE / OAuth) — this service is the glue that turns it into a first-class set of backend tools. Override the service for custom connection logic. TDD: +1 unit test — a fake MCP client returns one tool, we register it, and verify the wrapper dispatches `callTool` correctly and surfaces the result. 
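A rough sketch of the wrapping step described above, assuming a duck-typed client whose `listTools()` resolves to `{ tools }` and whose `callTool()` takes `{ name, arguments }`. The `McpLikeClient`, `WrappedTool` and `wrapMcpTools` names are hypothetical; the real `CoreAiMcpClientService` registers the wrappers in the AiToolRegistry, which is omitted here.

```typescript
// Duck-typed subset of an MCP client; shapes are assumptions for illustration.
interface McpLikeClient {
  listTools(): Promise<{ tools: { name: string; description?: string }[] }>;
  callTool(req: { name: string; arguments?: Record<string, unknown> }): Promise<{
    content?: unknown;
    isError?: boolean;
  }>;
}

interface ToolResult {
  success: boolean;
  data?: unknown;
  message?: string;
}

interface WrappedTool {
  name: string;
  description: string;
  mutating: boolean;
  roles: string[];
  execute(args: Record<string, unknown>): Promise<ToolResult>;
}

// List the remote tools and wrap each one under the namespaced name
// `<clientName>_<toolName>` with the conservative defaults from the commit message.
async function wrapMcpTools(clientName: string, client: McpLikeClient): Promise<WrappedTool[]> {
  const { tools } = await client.listTools();
  return tools.map(
    (tool): WrappedTool => ({
      name: `${clientName}_${tool.name}`,
      description: tool.description ?? '',
      mutating: true, // imported tools require confirmation by default
      roles: ['S_USER'], // placeholder for the framework's S_USER role constant
      async execute(args: Record<string, unknown>): Promise<ToolResult> {
        try {
          const res = await client.callTool({ name: tool.name, arguments: args });
          // Normalise the MCP response into the { success, data, message } shape.
          return res.isError
            ? { success: false, data: res.content, message: 'MCP tool reported an error' }
            : { success: true, data: res.content };
        } catch (err) {
          return { success: false, message: err instanceof Error ? err.message : String(err) };
        }
      },
    }),
  );
}
```

Catching the transport error inside `execute()` keeps a failing MCP server from crashing the agent loop; the model receives a structured failure instead.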
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(ai): use JSON scalar for attachments field (GraphQL schema build) * style(ai): oxfmt the MCP-client service * feat(ai): expose effective token limit + context-window utilization on responses Two new payload fields drive the chat UI's usage indicators: 1. `CoreAiBudgetSummary.maxTokens` + `scope` ('user' | 'tenant' | 'llm'): resolved as user-limit → tenant-limit → LLM context-window. Lets the client render a usage progress bar against the effective ceiling for THIS user. Null = no limit at all (frontend hides the bar). 2. `CoreAiResponse.contextWindow = { used, total }`: current session token utilization vs. the connection's auto-detected context window. Powers a "closing-circle" indicator (Claude Code-style) that appears once the model is approaching its context limit. Both work even without a user (LLM-window fallback) and even without the budget service (no budget = LLM fallback only). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(ai): user-facing prompt snippets ("Vorlagen") with own/tenant/global visibility A user-facing companion to the admin-only CoreAiPromptTemplate: any signed-in user can author short, named prompt snippets that they can insert into the chat input with one click. Different from the system-prompt building blocks, these are USER prompts (e.g. "Schreibe eine kurze Antwort zu …"). Visibility scopes (enforced server-side; the picker only sees what it's allowed to see): - 'user' — only the owner sees it (default). - 'tenant' — all members of the owner's tenant see it (tenant context required). - 'global' — every signed-in user sees it (creation requires the ADMIN role). Owner-only mutations: only the creator can update/delete; admins still pass via the standard admin pipeline. The full security model: - listVisible(): server-side $or query (own + global + tenant), filtered to enabled snippets, ordered by `order` asc then `name`. 
- create(): runs the standard pipeline first (validation + per-input whitelist), then writes the system-owned ownerId/scope/tenantId via a direct update — the input DTO deliberately doesn't expose those fields, so `prepareInput` would strip them if we passed them through super.create. - update(): assertOwner first, strip ownerId/tenantId from input, validate scope changes (incl. admin gate for 'global'), then persist tenantId directly when scope changed. - securityCheck(): per-row filter that returns undefined for snippets the caller is not allowed to see (own + tenant-member + global pass). - assertOwner(): ADMIN bypass; otherwise compares row.ownerId to user.id. Module wiring: new options field promptSnippetService, MongooseModule.forFeature schema entry, AI_PROMPT_SNIPPET_CLASS + service token, provider registration, barrel exports. The aiPromptSnippets collection has a unique compound index { ownerId: 1, name: 1 } so a user can't shadow their own snippet by accident. Endpoints (all S_USER): REST GET /ai/snippets POST /ai/snippets PUT /ai/snippets/:id DELETE /ai/snippets/:id GraphQL findAiPromptSnippets createAiPromptSnippet updateAiPromptSnippet deleteAiPromptSnippet Tests: +5 e2e covering scope filtering (own/tenant/global), admin-only global creation, tenant-context requirement, owner-only mutations, and the REST + GraphQL surface. Full suite green (1883 tests). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * refactor(ai): rename PromptTemplate→Slot + Snippet→Prompt; drop 'global' prompt scope Phase 1 of the AI-naming overhaul. Renames the two prompt-side stores to terms that match how they're actually used in the UI and conversations: * CoreAiPromptTemplate → CoreAiSlot The admin-edited building blocks of the SYSTEM prompt. Each row fills a keyed slot (`base`, `permissions`, `anti_hallucination`, `tool_catalog`, `output_contract`, `tool_protocol_emulated`, `error_guidance`, …) and overrides the framework default for that key. 
The model already used "slot" in its description text — the file/class/collection names now match. Adds an auto-set `tenantId` field so Phase 2 can introduce tenant-scoped overrides without another schema bump. * CoreAiPromptSnippet → CoreAiPrompt User-facing re-usable prompts ("Vorlagen") — short, named user-prompt presets that can be inserted into the chat input with one click. No longer to be confused with the system-prompt building blocks above. The existing CoreAiPromptInput (= the payload of the aiPrompt mutation, i.e. the user's question to the LLM) keeps its name; the CRUD inputs of the new model are CoreAiPromptCreateInput / CoreAiPromptUpdateInput. Endpoints and GraphQL operations follow: * /ai/prompt-templates → /ai/slots findAi/createAi/updateAi/deleteAiPromptTemplate(s) → findAi/createAi/updateAi/deleteAiSlot(s) * /ai/snippets → /ai/prompts findAi/createAi/updateAi/deleteAiPromptSnippet(s) → findAi/createAi/updateAi/deleteAiPrompt(s) The user-facing prompt scope drops `'global'`. The Backend used to allow admins to create system-wide prompts; admins now have a proper tenant-scoped system-prompt extension mechanism via Slots instead. Remaining valid scopes are `'user'` (private, default) and `'tenant'` (public within the tenant). Backend rejects `scope: 'global'` for everyone — including admins. Collections (aiPromptTemplates, aiPromptSnippets) are renamed to (aiSlots, aiPrompts). The AI module hasn't been used in production yet, so no DB migration is shipped — fresh installs start clean. Module override field renamed from `promptTemplateService`/`promptSnippetService` to `slotService`/`promptService`. Module exports updated. 37/37 ai e2e tests green. Full `pnpm run check` green. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * feat(ai): tenant-scoped slots with override/reset + runtime placeholder registry Phase 2 + 3 of the AI naming overhaul. 
Phase 2: Slot tenant-scoping + system-default overrides
========================================================

`CoreAiSlot` now carries a `tenantId` that's set system-side from the calling admin's tenant (RequestContext + serviceOptions). Multi-tenant deployments get per-tenant overrides for free; single-tenant deployments stay system-wide (tenantId stays undefined).

`CoreAiSlotService`:

- create / update / delete now require ADMIN and verify the row belongs to the caller's tenant; tenantId is set system-side and stripped from input.
- `resolveFragments(...)` filters DB rows by the request's tenantId and now also honors disabled (soft-deleted) system-default overrides: a row with `enabled: false` matching a system key HIDES the default for that tenant.
- New `listEffective(...)` returns the framework defaults overlaid by the tenant's rows, each with `isSystem` / `isOverride` flags so the admin UI can render the right action (edit / reset / disable / delete).
- New `resetSystemSlot(id, ...)` deletes a tenant override row, after which the framework default applies again. Refuses to operate on custom slots (use `delete` for those).

The default-fragments definition was extracted to a module function `getSystemDefaultSlots()` so both the prompt builder and the slot service can reach it without a circular dependency. The builder's `defaultFragments()` is now a thin wrapper that combines the system defaults with the configured `ai.systemPrompt`.

New endpoints:

- `GET /ai/slots/effective`: system defaults + overrides + customs (admin).
- `POST /ai/slots/:id/reset`: reset a tenant override (admin).

Phase 3: Runtime placeholder registry
======================================

New `CoreAiPlaceholderRegistry` service. The 6 system placeholders (`roles`, `tools`, `toolCatalog`, `documentation`, `learnedHints`, `userId`) are registered at boot; projects can add their own via `register({ name, description, resolve })` from any provider.
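The `register({ name, description, resolve })` contract can be illustrated with a small stand-alone registry. This is a sketch under assumptions and not the `CoreAiPlaceholderRegistry` source: the class name `PlaceholderRegistry`, the `PlaceholderContext` shape and the `list()` / `render()` methods are hypothetical, and leaving unknown tokens untouched is a choice made for the example.

```typescript
// Hypothetical context passed to resolvers.
interface PlaceholderContext {
  userId?: string;
  roles: string[];
}

interface PlaceholderDefinition {
  name: string;
  description: string;
  resolve: (ctx: PlaceholderContext) => string | Promise<string>;
}

class PlaceholderRegistry {
  private readonly entries = new Map<string, PlaceholderDefinition>();

  // Resolvers are plain TypeScript functions; nothing is loaded from the DB.
  register(def: PlaceholderDefinition): void {
    this.entries.set(def.name, def);
  }

  // What a listing endpoint could return: names and descriptions only.
  list(): { name: string; description: string }[] {
    return Array.from(this.entries.values()).map(({ name, description }) => ({ name, description }));
  }

  // Resolve every known {{token}} once, then substitute in a single pass.
  // Unknown tokens are left as they are.
  async render(template: string, ctx: PlaceholderContext): Promise<string> {
    const tokens = template.match(/\{\{\w+\}\}/g) ?? [];
    const names = Array.from(new Set(tokens.map((t) => t.slice(2, -2))));
    const values = new Map<string, string>();
    for (const name of names) {
      const def = this.entries.get(name);
      if (def) values.set(name, await def.resolve(ctx));
    }
    return template.replace(/\{\{(\w+)\}\}/g, (match: string, name: string) => values.get(name) ?? match);
  }
}

// Usage: two system-style placeholders plus one project-specific entry.
const placeholders = new PlaceholderRegistry();
placeholders.register({ name: 'userId', description: 'ID of the current user', resolve: (ctx) => ctx.userId ?? '' });
placeholders.register({ name: 'roles', description: 'Roles of the current user', resolve: (ctx) => ctx.roles.join(', ') });
placeholders.register({ name: 'projectName', description: 'Project-specific example', resolve: () => 'demo' });
```

Resolving all values before substituting avoids re-expanding tokens that happen to appear inside a resolved value.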
Why a registry instead of hard-coded names in the frontend: * The frontend loads the list via the new endpoint, so admin / user editors see EVERY currently-supported placeholder (including project- specific ones) without a frontend change. * Resolvers stay in TypeScript — no eval, no DB-stored function bodies, no admin-defined runtime code paths. Secure by construction. The prompt builder's `renderContext()` now delegates to the registry when available (falls back to the hard-coded record for legacy paths). New endpoint: `GET /ai/placeholders` (S_USER — placeholders aren't secrets, the resolver implementations stay backend-side). Tests: 43/43 ai e2e green (was 37/37). Added: - listEffective returns 10 system defaults on a clean tenant - admin override + reset round-trip - non-admin → ForbiddenException on slot reads/writes - placeholder registry lists the 6 system placeholders - non-admin user can hit `/ai/placeholders` (200) - project-registered placeholder is honored end-to-end Module override fields added: `placeholderRegistry`. Index barrel exports the new interface + service. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(ai): idempotent slot create + user-prompt placeholder resolution + docs Four follow-up optimizations on the AI naming overhaul + tenant-scoped slots + placeholder registry: 1. **Idempotent slot create** — `CoreAiSlotService.create()` now upserts on `(tenantId, key)`: a second "Override anlegen" / "Deaktivieren" action on the same system slot UPDATES the existing row instead of inserting a duplicate. Live-verified via Chrome MCP — two overrides of `base` produce a single DB row with the latest content; the row's `_id` stays stable across edits. 2. **User-prompt placeholder resolution** — `CoreAiService.prompt()` now runs a new `resolvePromptPlaceholders(input, serviceOptions)` pass BEFORE prepareRun, replacing `{{placeholder}}` tokens in the user's prompt text using the runtime placeholder registry. 
A stored prompt template like "Erkläre dem Nutzer mit ID {{userId}} …" now gets the real id substituted at run time. Unknown tokens are left as-is so plain text with curly braces survives. `promptStream()` inherits the resolution because it delegates to `prompt()`. 3. **assertSameTenant performance** — projects on `tenantId` only when verifying tenant ownership (cheaper than loading the full document). 4. **Documentation** — README, INTEGRATION-CHECKLIST and the 11.25.x → 11.26.0 migration guide now use the new Slot / Prompt / placeholder- registry terminology. The migration guide gains an "AI naming overhaul" section documenting the rename for projects that experimented with pre-release AI builds (cleanup steps: copy data from `aiPromptTemplates` → `aiSlots`, `aiPromptSnippets` → `aiPrompts`; routes / module override names changed; user-prompt scope `'global'` removed). Tests: +3 e2e (46/46 total ai e2e green) — covers (a) idempotent override, (b) user-prompt placeholder substitution with real values, (c) unknown tokens preserved. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * docs(ai): make the tool-registration step impossible to miss A fresh AI module on a new project answers "I don't have a tool to do that" for every domain question — that's correct behaviour (LLM is untrusted, the registry starts empty), but new integrators (humans and AI agents) keep reporting it as a bug. The docs now spell this out wherever someone might land first: - **CLAUDE.md** gains an "AI Module: Tools Are Opt-In" section right under Core Principles so AI agents working with the framework see it before starting any AI integration. - **README.md** (AI module) gains a prominent callout above the existing "## Tools" section explaining the no-auto-discovery contract, the three-step path (write tool subclass / group in AiToolsModule / import in ServerModule), the four reference tools, and the security contract (CrudService + serviceOptions; never direct Model.find()). 
- **INTEGRATION-CHECKLIST.md** §3 + §4 now explicitly flag both as REQUIRED for the assistant to do anything domain-specific, and walk through the copy-from-framework step with the exact import rewrites ('../../../../core/...' → '@lenne.tech/nest-server', user.service path adjustment) plus the boot-still-works fallback.

No behavioural change.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* feat(ai): cumulative budget summary + provider-quota fallback on connection

The token bar showed only the last request's tokens against the LLM context window when no user/tenant budget was configured. That's neither a real "used" value (it's per-request, not per-period) nor a real limit (the context window is per-call, not per-day). Two changes:

1. **Cumulative usage even at scope='llm'.** `buildSummary` now always aggregates the user's running per-period token total (via `getUsage({ userId }, period)`), the same query that backs the hard-budget path. The fallback hierarchy is unchanged:

   user override > tenant override > config defaults > provider quota > context window

   …but `summary.usedTokens` is now filled at EVERY level, not just for hard limits. The UI's token bar can therefore show a real "used in the current period" counter against whatever soft fallback applies.

2. **Provider-quota fallback on the connection.** Two new admin fields on `CoreAiConnection`:

   defaultUserMaxTokens?: number   // e.g. 50000 (= soft cap per user)
   defaultUserMaxPeriod?: string   // 'day' (default) | 'month' | 'none'

   When set, the connection's soft user quota is used as `maxTokens` (still surfaced as scope='llm' so the UI knows it isn't a 429-able hard limit). This is the technical landing spot for the user's "determined by the LLM provider" requirement: the admin pins the provider's per-user quota once per connection and the UI immediately reflects it.
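The fallback hierarchy above (user override > tenant override > config defaults > provider quota > context window) could be resolved along these lines. This is a hedged sketch, not the real `buildSummary`: the `BudgetSources` shape, the `hard` flag and the function name `resolveBudget` are hypothetical, and reporting config defaults at scope `'user'` is an assumption the source does not confirm.

```typescript
type BudgetScope = 'user' | 'tenant' | 'llm';

// Hypothetical inputs; undefined means "not configured at this level".
interface BudgetSources {
  userOverride?: number;
  tenantOverride?: number;
  configDefault?: number; // assumption: treated as a user-scoped hard limit
  providerQuota?: number; // e.g. the connection's defaultUserMaxTokens
  contextWindow?: number;
}

interface BudgetSummarySketch {
  usedTokens: number; // cumulative per-period usage, filled at every level
  maxTokens: number | null; // null = no limit at all
  scope: BudgetScope | null;
  hard: boolean; // false for the soft 'llm' fallbacks
}

// Walk the hierarchy from most to least specific and take the first configured value.
function resolveBudget(sources: BudgetSources, usedTokens: number): BudgetSummarySketch {
  const candidates: { value: number | undefined; scope: BudgetScope; hard: boolean }[] = [
    { value: sources.userOverride, scope: 'user', hard: true },
    { value: sources.tenantOverride, scope: 'tenant', hard: true },
    { value: sources.configDefault, scope: 'user', hard: true },
    { value: sources.providerQuota, scope: 'llm', hard: false },
    { value: sources.contextWindow, scope: 'llm', hard: false },
  ];
  const hit = candidates.find((c) => c.value !== undefined);
  return {
    usedTokens,
    maxTokens: hit?.value ?? null,
    scope: hit?.scope ?? null,
    hard: hit?.hard ?? false,
  };
}
```

Keeping `usedTokens` independent of which level supplied the ceiling mirrors the change described above: the bar always shows cumulative usage, and only the ceiling varies.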
The resolved-connection interface gains the two fields so the orchestrator can pass them through to `buildSummary`; the connection input DTO accepts them for create/update. UI (in nuxt-base-starter): the token bar now reads `usedTokens` for every scope (no more "last request" special case), the scope label for `'llm'` becomes "Anbieter-Quota (weich)", and the tooltip says "Kumulativ (weiches Limit)" so users understand the bar fills over the period. Tests: +2 e2e (48/48 ai e2e total green) — covers cumulative aggregation under the context-window fallback and the new provider-quota fallback. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * chore(release): 11.26.0 — AI assistant module + tenant-scoped slots + placeholder registry Version bump to ship the AI-module work that has been collecting on feature/ai-module. The complete catalog of changes lives in `migration-guides/11.25.x-to-11.26.0.md`, which the same commit extends with the missing follow-up sections (user-prompt placeholder resolution, cumulative budget summary at every scope, the new `defaultUserMaxTokens` / `defaultUserMaxPeriod` provider quota on `CoreAiConnection`, and an explicit "Tools are opt-in" callout for new integrators). No breaking changes for projects that don't opt in to the AI module (absent `ai` config block → module stays disabled). For projects that experimented with the pre-release AI build, the "AI naming overhaul" section of the migration guide lists every rename (`PromptTemplate → Slot`, `PromptSnippet → Prompt`, the `'global'` user-prompt scope removal, etc.). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(docs): demote spectaql {{placeholder}} interpolation errors to warnings Spectaql renders GraphQL descriptions through Handlebars and turned the AI module's `{{placeholders}}` notation into a hard `Unsupported interpolation` error. 
That error broke the `docs:bootstrap` script chain — spectaql crashed before writing its tmp index.html, which then surfaced as a cascade of follow-up ENOENT errors. The `{{placeholders}}` notation in slot / prompt content descriptions is intentional: it documents the runtime placeholder registry consumed by `CoreAiPromptBuilderService`, not a Handlebars binding. Sets `spectaql.errorOnInterpolationReferenceNotFound: false` in `spectaql.yml` (must sit under the `spectaql:` block, not at root — verified against spectaql 3.0.9 source / examples). Unresolved interpolations now log as `WARNING: Unsupported interpolation encountered: "{{placeholders}}"` and the docs build runs to completion. Verified: `pnpm run docs:bootstrap` exits 0; `public/index.html` is regenerated; the ENOENT cascade in `server.log` is gone. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * chore(boot): silence dotenv startup banner `dotenv` v17.4.2 emits a "◇ injected env (N) from .env // tip: …" line on every `config()` call. The tip is randomly chosen from a hard-coded TIPS array that includes promotional links to other Motdotla products (www.dotenvx.com, www.vestauth.com). Since the framework already has structured logging via NestJS, the banner is pure noise. `{ quiet: true }` is dotenv's surgical kill switch — it disables the two status `_log()` calls inside dotenv (the "injected env" line and the encrypted-`.env.vault` loader line, which this codebase doesn't trigger anyway). It does NOT silence `_warn()` (e.g. missing `.env.vault` for a configured `DOTENV_KEY`) or `_debug()` (only active with `debug: true`), so genuine misconfiguration warnings still surface. Applied at both `dotenv.config()` call sites — `src/config.env.ts:19` (top-level config bootstrap) and `src/core/common/helpers/config.helper.ts` (env-aware helper used by consumer projects). 
Verified by running `pnpm start` after the change — server boot log now starts directly with `Configured for: local` and the regular NestJS InstanceLoader output; both `injected env` lines and the `vestauth` promo are gone. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * chore: silence spectaql warnings, clear lint hints, fix vitest oxc warning - Reword AI Slot descriptions to drop `{{placeholders}}` notation (the registry endpoint at `/ai/placeholders` already documents the active token set) — silences three spectaql "Unsupported interpolation" warnings without losing information. - Resolve oxlint warnings: switch `!= null` / `== null` to strict comparisons in the placeholder/prompt resolver paths, rename `registry` → `placeholderRegistry` in two test cases to avoid shadowing the outer `AiToolRegistry` variable, and add a `no-useless-constructor` disable + explanatory comment on the two built-in tool subclasses (their constructors are NOT useless — they re-export the protected `AiTool` constructor as public so NestJS DI can instantiate them). - Add `oxc: false` to `vitest.config.ts` and `vitest-e2e.config.ts`: Vite 8 switched the default TS/JS transformer from esbuild to Oxc; `unplugin-swc` disables esbuild internally, but without the new flag Oxc would still run in parallel and emit a deprecation warning on every test run. - Migration guide 11.25.x→11.26.0: new "Operational notes (no action required)" section covering the dotenv banner suppression, the vitest/Vite 8 `oxc: false` tip, and the Slot description rewrite. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com> * fix(ai): OAuth client binding, conversation list projection, createHttpUser cookie→JWT fallback + extended overrides Round of fixes from a multi-agent code review of the AI module. All 9 HIGH-severity findings addressed; one pre-existing test-flakiness in `tests/ai.e2e-spec.ts` fixed in the process. 
Security (MCP OAuth 2.1):

- Bind refresh-token rotation to client_id (OAuth 2.1 §4.13.2 / §7.4). This prevents a stolen refresh token from being rotated by a different client and used to impersonate the original user. `rotateRefreshToken(token, clientId)` now atomically calls `findOneAndDelete({ token, clientId })`; `exchangeRefreshToken` rejects requests where `client.client_id` is missing or does not match.
- Return `client_secret`/`client_secret_expires_at`/`token_endpoint_auth_method` from `getClient()` so the MCP SDK's `clientAuth` middleware can actually verify the secret. Previously the secret was persisted but dropped on read, silently downgrading confidential clients to public clients (the SDK's secret-verify branch is gated by `if (client.client_secret)`).

Performance:

- Add `index: true` to `CoreAiConversation.createdBy`; `findByOwner` previously required a collection scan.
- Add `select: '-messages'` to `findAiConversations` / `findConversations`; the conversation list previously over-fetched up to 500 messages per item.

Public API typing:

- Extend `ICoreModuleOverrides.ai` with `mcpClientService`, `modeService`, `placeholderRegistry`, `promptHintService`, `promptService`, `slotService`, `toolGrantService`, `toolPolicyService` (already accepted at runtime via `CoreAiModule.forRoot(...)`; TypeScript was blocking consumers from passing them).
- Add typed `IAi.claudeCli?: { bin?, extraArgs?, maxBudgetUsd? }`; it was already consumed by `ClaudeCliProvider` but untyped.
- Sync the `.claude/rules/configurable-features.md` AI row to the post-rename surface (`aiSlots` / `/ai/slots` / `slotService`); previous text contradicted the README, INTEGRATION-CHECKLIST and migration guide.

Tests:

- 21 new unit tests covering the OAuth 2.1 store flow (single-use auth code, client_secret roundtrip, refresh-token rotation bound to client_id incl. the stolen-token-rejected case, `exchangeAuthorizationCode` and `exchangeRefreshToken` end-to-end through `buildOAuthProvider`, `challengeForAuthorizationCode`) and the MCP HTTP session lifecycle (`handlePost`/`handleGet`/`handleDelete` 401 with WWW-Authenticate, 404 for unknown sessions, `resolveUser` precedence chain, `evictIfNeeded` cap).
- Fix flaky 401 in `createHttpUser` (pre-existing): BetterAuth's `api.signInEmail()` does not always echo a session token into the response body. In isolated test runs the body is `{ requiresTwoFactor, success, user }` without `token` or `session`. The session is still established via the `iam.session_token` Set-Cookie. `createHttpUser` now forwards that cookie to `betterAuthService.getToken({ headers: { cookie } })` (the JWT plugin path) to deterministically obtain a Bearer JWT, which eliminates flaky 401s across the HTTP-layer tests in this file.

Docs / housekeeping:

- Add `load-tests/ai-smoke.k6.js` as the AI request-pipeline smoke baseline.
- Persist reviewer agent memories for future review continuity.

Verification: 3 consecutive `pnpm run check` runs green (1915 tests pass, build OK, server starts cleanly).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* docs(migration): document AI module fixes from the post-release review

Add migration notes for the changes that landed in the previous commit:

- Conversation list endpoint behaviour change (`select: '-messages'`): clients that read `messages` from the list endpoint must switch to the per-id detail call. Surfaced both in "Operational notes" and as a ⚠️ Behaviour change row in Compatibility Notes.
- MCP OAuth security fixes: refresh-token rotation is now bound to `client_id` (impersonation fix) and `getClient()` returns `client_secret` so confidential clients actually verify it. Subclassers must mirror the new `rotateRefreshToken(token, clientId)` signature and not strip `client_secret` from the returned shape.
- Typed `ICoreModuleOverrides.ai` (now exposes all 18 collaborators) and `IAi.claudeCli`: `as any` casts can be dropped.

Overview Bugfixes row + Compatibility Notes updated to point readers at the new "Operational notes" entries.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(ai): 5 bugs found during live mittwald-LLM browser test

Live test against a fresh fullstack workspace (mittwald Ministral-3-14B-Instruct-2512) surfaced five bugs of varying severity. This commit fixes all of them and adds the missing operational docs.

BUG-1 (HIGH): BSONError 500 from `loadRecentMessages` on bad conversationId

- A client that…
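The client-bound refresh-token rotation described in the Security notes above can be sketched with an in-memory store. This is a minimal illustration under stated assumptions, not the module's implementation: `InMemoryRefreshTokenStore`, `RefreshTokenRecord`, and `issue()` are hypothetical, and only the `rotateRefreshToken(token, clientId)` signature and the "match on both token and clientId, then delete" behaviour come from the commit message.

```typescript
// Hypothetical record shape; the real store's schema is not shown in this PR.
interface RefreshTokenRecord {
  token: string;
  clientId: string;
  userId: string;
}

class InMemoryRefreshTokenStore {
  private readonly records = new Map<string, RefreshTokenRecord>();
  private counter = 0;

  // Predictable tokens for illustration only; real tokens must be random.
  issue(clientId: string, userId: string): string {
    const token = `rt-${++this.counter}`;
    this.records.set(token, { token, clientId, userId });
    return token;
  }

  // Mirrors the idea of findOneAndDelete({ token, clientId }): the record is
  // consumed only if BOTH fields match; on a mismatch nothing is deleted.
  rotateRefreshToken(token: string, clientId: string): string | null {
    const record = this.records.get(token);
    if (!record || record.clientId !== clientId) {
      return null;
    }
    this.records.delete(token);
    return this.issue(record.clientId, record.userId);
  }
}

const store = new InMemoryRefreshTokenStore();
const original = store.issue('client-a', 'user-1');

// A different client presenting the stolen token is rejected.
const stolenAttempt = store.rotateRefreshToken(original, 'client-b');
console.log(stolenAttempt); // null

// The rightful client can still rotate, because the failed attempt consumed nothing.
const rotated = store.rotateRefreshToken(original, 'client-a');
console.log(rotated !== null && rotated !== original); // true

// The old token is single-use: a second rotation with it fails.
const replay = store.rotateRefreshToken(original, 'client-a');
console.log(replay); // null
```

Matching on both fields in a single lookup-and-delete step means a mismatched `clientId` leaves the token intact for its rightful client, while a successful rotation removes the old token so it cannot be replayed.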

kaihaase merged commit f5fe09f into main May 31, 2026
1 check passed

kaihaase merged 1 commit into main from develop

kaihaase commented May 31, 2026


1 participant