feat(ai): AI assistant module (11.26.0) #546
Merged
…hestrator)
Adds an extensible AI-assistant layer to the core:
- DB-backed LLM connections (CoreAiConnection) with AES-256-GCM-encrypted API keys (admin CRUD; key never returned, only hasApiKey)
- Provider abstraction (ILlmProvider) + fetch-based OpenAiCompatibleProvider (mittwald/OpenAI-compatible) + LlmProviderFactory
- Global AiToolRegistry with role-filtered, self-registering tools (IAiTool/AiTool)
- CoreAiService orchestrator with emulated tool calling (mittwald has no native tool calling), rate-limit + audit hooks
- GraphQL resolver + REST controller (aiPrompt + connection CRUD)
- CoreAiModule.forRoot (autoRegister + overrides), `ai` config in IServerOptions, CoreModule wiring, exports, FRAMEWORK-API regen
- Example User tools + AiToolsModule in src/server
- Docs (README, INTEGRATION-CHECKLIST, configurable-features) + AI-MODULE-PLAN.md
- Tests: unit 7/7, e2e 6/6 (full suite 85/85)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
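A minimal sketch of how an API key could be stored with AES-256-GCM and exposed only as `hasApiKey`, using Node's built-in `crypto` module. The helper names, the scrypt key derivation, and the `iv.tag.data` storage layout are assumptions for illustration; the module's actual crypto service may work differently.

```typescript
import { createCipheriv, createDecipheriv, randomBytes, scryptSync } from "node:crypto";

// Hypothetical helper: derive a 32-byte AES-256 key from a configured secret.
// The fixed salt is for illustration only.
function deriveKey(secret: string): Buffer {
  return scryptSync(secret, "ai-connection-keys", 32);
}

function encryptApiKey(plain: string, secret: string): string {
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", deriveKey(secret), iv);
  const data = Buffer.concat([cipher.update(plain, "utf8"), cipher.final()]);
  const tag = cipher.getAuthTag(); // authentication tag detects tampering or a wrong key
  return [iv, tag, data].map((b) => b.toString("base64")).join(".");
}

function decryptApiKey(stored: string, secret: string): string {
  const [iv, tag, data] = stored.split(".").map((p) => Buffer.from(p, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", deriveKey(secret), iv);
  decipher.setAuthTag(tag);
  // final() throws if the tag does not verify
  return Buffer.concat([decipher.update(data), decipher.final()]).toString("utf8");
}

// Public view: the encrypted key is never returned, only a boolean flag.
function toPublicConnection(c: { name: string; apiKeyEncrypted?: string }) {
  return { name: c.name, hasApiKey: Boolean(c.apiKeyEncrypted) };
}
```

Because GCM is authenticated, decrypting with a different secret fails loudly instead of returning garbage, which is what makes a boot-time "can this key still be decrypted" check possible.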
- CoreAiInteraction model (aiInteractions, admin-only) + CoreAiInteractionService with system-internal record()
- CoreAiService.audit() persists when ai.audit is enabled (optional injection, never breaks a prompt response)
- Admin read endpoints (findAiInteractions/getAiInteraction) on resolver + controller
- ai.audit config flag; model/service wired into CoreAiModule (+ override option)
- Test: prompt persists an audit record (e2e 7/7)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ase 4)
- CoreAiConversation model (aiConversations) + CoreAiMessage subdoc; owner-scoped via securityCheck (creator/admin only)
- CoreAiConversationService with appendMessage() ($push, never round-trips the subdocument array through update())
- CoreAiService loads prior turns into the LLM context when conversationId is given and appends the user+assistant turns after the run
- Owner-scoped CRUD endpoints (create/find/get/delete) on resolver + controller
- Model/service wired into CoreAiModule (+ override option), exports
- Test: 2-turn conversation keeps context and persists 4 messages (e2e 8/8)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- CoreAiService.promptStream(): emits action events, then the answer as token chunks, then a final event (reuses prompt() so history/audit/persistence apply)
- AiStreamEvent type; POST /ai/stream SSE endpoint (raw @Res, role-guarded)
- Test: streaming yields action/token/final, tokens concatenate to the answer (unit 8/8, e2e 8/8)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
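The event order described above (actions, then token chunks, then a final event) can be sketched as a generator over an already-computed result. The event field names, the `chunkSize` parameter, and the use of a synchronous generator are assumptions; only the action/token/final sequence comes from the commit message.

```typescript
// Hypothetical event shape; the real AiStreamEvent type may carry other fields.
type AiStreamEvent =
  | { type: "action"; name: string }
  | { type: "token"; text: string }
  | { type: "final"; answer: string };

function* toStreamEvents(
  result: { actions: string[]; answer: string },
  chunkSize = 8,
): Generator<AiStreamEvent> {
  // 1. One event per executed action
  for (const name of result.actions) yield { type: "action", name };
  // 2. The answer, split into fixed-size chunks
  for (let i = 0; i < result.answer.length; i += chunkSize) {
    yield { type: "token", text: result.answer.slice(i, i + chunkSize) };
  }
  // 3. A closing event carrying the complete answer
  yield { type: "final", answer: result.answer };
}

// Server-Sent Events framing: a "data:" line followed by a blank line.
const toSseFrame = (e: AiStreamEvent): string => `data: ${JSON.stringify(e)}\n\n`;
```

A client that concatenates the `token` texts ends up with the same string as the `final` event, which is the property the streaming test asserts.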
- IAiTool.destructive flag; CoreAiPromptInput.confirm
- CoreAiResponse.requiresConfirmation + pendingActions
- Orchestrator halts on a destructive tool call until the prompt is re-sent with confirm: true (no execution, no conversation persistence while pending)
- Example destructive tool delete_user (admin) added to the reference tools
- Test: destructive tool blocked until confirmed, then executes (unit 9/9)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- CoreAiMcpService: role-filtered mcpListTools()/mcpCallTool() (testable logic) and lazy createServer() using the low-level MCP SDK Server with JSON-schema tool definitions (no zod conversion)
- CoreAiMcpController: Streamable HTTP at /ai/mcp (POST/GET/DELETE), per-session McpServer bound to the authenticated user (Bearer via @CurrentUser), session map with eviction, MCP-style 401 + WWW-Authenticate
- ai.mcp config flag; controller registered only when enabled; @modelcontextprotocol/sdk added (fixed 1.29.0, lazy-loaded)
- Tests: MCP role gating + permitted/forbidden calls (unit 10/10); MCP 401 over HTTP + app boot with MCP (e2e 9/9)
- Docs (README, configurable-features) + AI-MODULE-PLAN updated
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ompt e2e - Package types using the JSON scalar (AI models) leak into the global GraphQL type registry, so any schema build needs the JSON scalar provided. Real apps provide it via ServerModule; the framework-internal error-code-scenarios bare apps now provide it too (one JSON provider per app — no consumer impact). - Make the AI prompt e2e deterministic under parallel runs by exercising a registered test tool instead of find_users (shared User collection). Full e2e suite green: 61 files, 1770 tests. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…enrichment (Phases 9-12)
Phase 9 — Plan mode (Goals 1/2/5): input.mode='plan' produces a full plan, then
pre-flight authorizes ALL steps (registry role filter + optional IAiTool.authorize()
dry-run) BEFORE executing anything. If any step is not permitted, NOTHING runs and a
translated (de/en) error with deniedActions is returned. Otherwise steps run in order.
Phase 10 — Confirmation policy for mutating actions: IAiTool.mutating; ai.confirmation.mutating
{ default, enforced }; client override via input.requireConfirmation (ignored when enforced);
destructive tools always require confirmation. Applies to both auto and plan modes.
Phase 11 — Client metadata: input.metadata (URL, navigation, console logs) injected as a
clearly-delimited, size-capped, UNTRUSTED context message (prompt-injection hardening).
Phase 12 — Prompt enrichment: system prompt now includes the user's roles + available tools
and optional system documentation (ai.documentation / overridable getDocumentation()).
Refactor: prompt() dispatches to runAuto/runPlan via shared prepareRun; extracted helpers
(authorizeCall, confirmationRequiredFor, translate, appendClientContext, …).
CoreAiResponse gains plan/denied/deniedActions. Tests: unit 19/19, ai e2e 9/9.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
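The all-or-nothing pre-flight of plan mode can be sketched as two phases: authorize every step first, execute only if none was denied. The `Step`/`Tool` shapes and the `runPlan` signature below are simplified assumptions, not the orchestrator's real types.

```typescript
type Step = { tool: string; args: Record<string, unknown> };
type Tool = {
  roles: string[];
  authorize?: (args: Record<string, unknown>) => boolean; // optional dry-run check
  run: (args: Record<string, unknown>) => unknown;
};

function runPlan(steps: Step[], tools: Record<string, Tool>, userRoles: string[]) {
  // Phase 1: collect every step the user is not allowed to run. Nothing executes here.
  const deniedActions = steps
    .filter((s) => {
      const tool = tools[s.tool];
      if (!tool) return true; // unknown tool counts as denied
      const roleOk = tool.roles.some((r) => userRoles.includes(r));
      return !roleOk || (tool.authorize ? !tool.authorize(s.args) : false);
    })
    .map((s) => s.tool);
  if (deniedActions.length > 0) {
    return { denied: true, deniedActions, results: [] as unknown[] };
  }
  // Phase 2: every step is permitted, so run them in order.
  return { denied: false, deniedActions, results: steps.map((s) => tools[s.tool].run(s.args)) };
}
```

Separating the phases means a plan whose last step is forbidden leaves no partial side effects from its earlier steps.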
- ai.budget { maxPromptsPerDay, maxTokensPerDay } enforced in prepareRun before any
LLM call; exceeding it aborts with HTTP 429 + translated message
- CoreAiInteractionService.usageSince() aggregates today's prompts/tokens per user
- ai.defaultMode lets admins default to plan mode; ai.confirmation/documentation typed
- Tests: budget block + under-budget pass (unit 21/21)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… 7b.1) - CoreAiMcpController resolves the user via req.user and, as a fallback, by verifying the Bearer token directly (BetterAuthTokenService) — so MCP works even though the S_EVERYONE guard does not populate the user - Full MCP protocol test over the SDK in-memory transport: initialize handshake → tools/list (role-filtered) → tools/call execution (unit 22/22) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…e 7b.2)
- CoreAiMcpOAuthService: HMAC-signed access tokens (constant-time verify), PKCE S256
verification, MongoDB-backed stores (clients/codes-TTL/refresh), loadUser, and
buildOAuthProvider() implementing the SDK OAuthServerProvider (clients store, PKCE
challenge lookup, code/refresh exchange, access-token verification)
- mountAiMcpOAuth(app) helper to mount mcpAuthRouter in main.ts (lazy SDK)
- MCP controller now also accepts OAuth access tokens when ai.mcp.oauth is enabled
- ai.mcp gains { oauth, oauthSecret }; interactive consent is overridable
(authorizeConsent) and documented for consumer main.ts integration
- Tests: token roundtrip/tamper/expiry + PKCE S256 (unit 26/26)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
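A short sketch of the two checks named above, PKCE S256 verification and HMAC-signed tokens with constant-time comparison, built on Node's `crypto` module. The token layout (`base64url(payload).signature`) and all function names are assumptions; the S256 transform itself follows RFC 7636.

```typescript
import { createHash, createHmac, timingSafeEqual } from "node:crypto";

// PKCE S256: challenge = BASE64URL(SHA-256(code_verifier))
function pkceChallenge(verifier: string): string {
  return createHash("sha256").update(verifier).digest("base64url");
}

// Constant-time string comparison; timingSafeEqual requires equal lengths.
function safeEqual(a: string, b: string): boolean {
  const x = Buffer.from(a);
  const y = Buffer.from(b);
  return x.length === y.length && timingSafeEqual(x, y);
}

function verifyPkce(verifier: string, storedChallenge: string): boolean {
  return safeEqual(pkceChallenge(verifier), storedChallenge);
}

type TokenPayload = { sub: string; exp: number }; // exp in milliseconds (assumption)

// Hypothetical token layout: base64url(JSON payload) + "." + HMAC-SHA256 signature
function signToken(payload: TokenPayload, secret: string): string {
  const body = Buffer.from(JSON.stringify(payload)).toString("base64url");
  const sig = createHmac("sha256", secret).update(body).digest("base64url");
  return `${body}.${sig}`;
}

function verifyToken(token: string, secret: string, now = Date.now()): TokenPayload | null {
  const [body, sig] = token.split(".");
  if (!body || !sig) return null;
  const expected = createHmac("sha256", secret).update(body).digest("base64url");
  if (!safeEqual(sig, expected)) return null; // tampered or signed with another secret
  const payload = JSON.parse(Buffer.from(body, "base64url").toString("utf8")) as TokenPayload;
  return payload.exp > now ? payload : null; // expired tokens are rejected
}
```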
- README: plan mode, confirmation policy, client metadata, prompt enrichment, budget, and MCP OAuth 2.1 sections - INTEGRATION-CHECKLIST: advanced config + main.ts OAuth mounting step - configurable-features: full ai.* config reference - Remove AI-MODULE-PLAN.md (all backend phases done; full suite 1786 tests green) Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ng (Phase 13)
- CoreAiBudgetLimit model (aiBudgetLimits, admin CRUD) for per-user/per-tenant
limit overrides; CoreAiBudgetService resolves override → ai.budget default →
unlimited (0/missing = unlimited). period day/month/none with resetAt.
- ai.budget restructured: { period, user:{maxTokens,maxPrompts}, tenant:{...} }.
- Enforcement (user OR tenant) before the run → HTTP 429 + translated message;
usage read via a read-only native count over aiInteractions (tenant filter).
- tenantId captured on aiInteractions (tenant plugin) for per-tenant accounting.
- Every response carries a compact `budget` summary (promptTokens, usedTokens,
remainingTokens, resetAt); full breakdown via aiUsage query / GET /ai/usage.
- Admin budget-limit CRUD + aiUsage endpoints (resolver + controller); module
wiring, exports, docs. Replaces the flat Phase-8 budget.
- Tests: unit (resolve/assert/summary/usage-info) + e2e (limit CRUD, response
budget, 429 enforcement, aiUsage). Full suite 1792 green.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
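The limit resolution (override, then the `ai.budget` default, then unlimited) and the "user OR tenant" enforcement might look roughly like this. The types and function names are hypothetical, and the sketch assumes an explicit override of `0` means unlimited rather than "fall back to the default", which the commit message leaves open.

```typescript
type Limits = { maxTokens?: number; maxPrompts?: number };
type Usage = { tokens: number; prompts: number };

// 0 or a missing value means unlimited, modelled here as Infinity.
function effective(value: number | undefined): number {
  return value && value > 0 ? value : Infinity;
}

// Per-field resolution: override first, then the configured default, then unlimited.
function resolveLimits(override?: Limits, configDefault?: Limits) {
  return {
    maxTokens: effective(override?.maxTokens ?? configDefault?.maxTokens),
    maxPrompts: effective(override?.maxPrompts ?? configDefault?.maxPrompts),
  };
}

function exceeds(used: Usage, limits: { maxTokens: number; maxPrompts: number }): boolean {
  return used.tokens >= limits.maxTokens || used.prompts >= limits.maxPrompts;
}

// A run is rejected (HTTP 429 in the module) when either the user or the tenant is over budget.
function isOverBudget(
  user: { used: Usage; limits: ReturnType<typeof resolveLimits> },
  tenant: { used: Usage; limits: ReturnType<typeof resolveLimits> },
): boolean {
  return exceeds(user.used, user.limits) || exceeds(tenant.used, tenant.limits);
}
```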
…no vendor names)
- ILlmProvider declares LlmCapabilities { nativeTools, jsonResponse, systemPrompt }
(replaces supportsNativeTools); orchestrator compensates across all gradations.
- Capabilities configured per connection (supportsNativeTools, supportsJsonResponse);
OpenAiCompatibleProvider derives them and sends native tools / response_format only
when supported.
- Removed all concrete vendor/runtime names from code; docs neutralized to describe
the OpenAI-compatible API shape (protocol, not a vendor) + capability gradations.
- config.env example seed genericized (AI_BASE_URL/AI_API_KEY, only seeded when set).
- Tests + docs updated. Unit 31/31, ai e2e 10/10.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds a prioritized, fully overridable connection-resolution chain so projects with multiple LLM connections can flexibly pick one per request. No connections → AI handling is disabled (translated "unavailable" response); exactly one → it is the implicit default; multiple → resolved via 8 ascending-priority layers.
Resolution order (later overrides earlier):
1 global default (isDefault): soft (must be available to tenant)
2 tenant default (preference): soft
3 user default (preference): soft
4 client selection (input): soft
5 tenant-enforced (preference): hard (mandatory, wins regardless)
6 admin-enforced global (flag): hard
7 admin-enforced per tenant (flag): hard
8 code override (serviceOptions): hard (deliberate, trusted top layer)
Each layer is an overridable protected method; resolutionLayers() can be reordered/replaced. Availability is restrictable per tenant via Connection tenantIds (empty = all tenants).
New:
- CoreAiConnectionResolverService (the chain; overridable per project)
- CoreAiConnectionPreference model/input/service (tenant/user defaults + tenant-enforced, unique per (scope, refId))
- CoreAiAvailableConnection model (non-sensitive list with selected/locked flags)
- Connection fields: tenantIds, enforced, enforcedTenantIds
- Endpoints: aiAvailableConnections, aiSetUserConnection (S_USER self-service), admin preference CRUD (GraphQL + REST under /ai/connections/*)
Orchestrator now resolves the connection via the chain (falls back to the plain connection service when no resolver is wired), returns a denied response when no connection is usable, and honors the serviceOptions._aiConnectionId code override.
Module wiring, ICoreModuleOverrides.ai (connectionResolver, preferenceService), barrel + top-level exports, README/INTEGRATION-CHECKLIST/configurable-features docs, and FRAMEWORK-API.md updated.
Tests: 14 unit (resolution chain incl. each layer, availability filtering, one=default, none=disabled, subclass override) + 5 e2e (per-tenant restriction, self-service validation, tenant-enforced lock, disabled response, endpoint auth). Full suite green (1811 tests). Removes the temporary AI-MODULE-PLAN.md.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
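The ascending-priority chain can be pictured as a loop in which each later layer may overwrite the current selection. This is a simplified sketch: the `Layer` shape and `resolveConnection` function are hypothetical, and it assumes hard layers skip the tenant-availability check ("wins regardless"), which the real resolver may handle differently.

```typescript
type Connection = { id: string; tenantIds: string[] }; // empty tenantIds = all tenants
type Layer = { name: string; hard: boolean; pick: () => string | undefined };

function resolveConnection(
  connections: Connection[],
  layers: Layer[], // ordered from lowest to highest priority
  tenantId?: string,
): string | undefined {
  if (connections.length === 0) return undefined; // no connections: AI is disabled
  if (connections.length === 1) return connections[0].id; // exactly one: implicit default
  const byId = new Map<string, Connection>();
  for (const c of connections) byId.set(c.id, c);
  let selected: string | undefined;
  for (const layer of layers) {
    const id = layer.pick();
    const conn = id ? byId.get(id) : undefined;
    if (!conn) continue; // layer has no opinion, or points at a missing connection
    const available =
      conn.tenantIds.length === 0 || (tenantId !== undefined && conn.tenantIds.includes(tenantId));
    // Soft layers only win if the connection is available to the tenant.
    if (layer.hard || available) selected = conn.id;
  }
  return selected;
}
```

Skipping a layer whose pick no longer exists also gives the degrade-to-fallback behaviour for stale preferences instead of returning a dead id.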
The MCP SDK transport exposes `onclose` as a callback property, not a DOM EventTarget, so addEventListener does not apply. Add a targeted inline disable. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Four follow-up optimizations to the connection-resolution chain:
1. Robustness: the chain now drops a selection that points to a deleted/disabled connection (orphaned enforced preference or stale code override) with a warn log and degrades to the fallback, instead of returning a dead id that made connectionService.resolve() throw a 404 mid-prompt.
2. Admin validation: CoreAiConnectionResolverService.setPreference() verifies the connection exists and is usable before persisting; the admin GraphQL/REST preference endpoints route through it (fail-fast instead of a dangling preference).
3. Performance: tenantDefault (layer 2) and tenantEnforced (layer 5) now share a single tenant-preference DB read per resolution (memoized on the ctx via WeakMap).
4. Cleanup: deleting a connection removes preferences pointing to it (PreferenceService.deleteByConnectionId + ConnectionService.delete override with an @Optional preference service; best-effort, never fails the delete).
No DI cycle: resolver → {connection, preference}; connection → preference; preference → none.
Tests: +4 unit (P1 stale enforced + stale code override degrade, P2 setPreference validation, P3 single tenant query) and +2 e2e (P4 preference cleanup on delete, admin setPreference validation). Full suite green (1817 tests). README updated.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Implements all optimizations from the multi-agent review of feature/ai-module.
Performance:
- getUsage() now sums prompts/tokens server-side via a $group aggregation instead
of loading every aiInteractions doc per prompt
- Added compound indexes { userId, createdAt } and { tenantId, createdAt } on
aiInteractions for the budget period query
- buildSummary() skips the usage aggregation for unlimited users (no finite limit)
- Conversation history loads via a lean, projected, $slice-capped read
(loadRecentMessages) instead of a hydrated get() running the full process()
pipeline over the whole messages array
- appendMessage() caps the messages array with $slice (-500)
Security:
- Adopted the ErrorCode registry for all thrown AI exceptions; added the
LTNS_0600–LTNS_0608 AI error codes (de/en translated)
- Added enum validation (IsIn) for mode / scope / period and @MaxLength on prompt
- CoreAiConnectionPreference model tightened from S_USER to ADMIN
- CheckSecurityInterceptor now MERGES (union) project secretFields with the
framework defaults so a custom list can't drop password/apiKeyEncrypted/etc.
- Optional SSRF allowlist ai.allowedBaseUrlHosts (opt-in; local providers like
Ollama still work by default)
- ai-crypto getKey() and provider mapNativeToolCalls() are now protected (overridable)
Docs:
- Removed all residual vendor names from shipped code (provider-agnostic)
- Reconciled the AI override fields across ICoreModuleOverrides.ai (now 10),
configurable-features.md, and the README override table
- .env.example: added AI env vars incl. NSC__AI__ENCRYPTION_SECRET
- FRAMEWORK-API generator now emits IAi / IAiRateLimit / IAiDefaultConnection
- REQUEST-LIFECYCLE.md documents the AI/MCP/SSE/OAuth entry points
- New migration guide migration-guides/11.25.x-to-11.26.0.md
- Fixed the MCP controller JSDoc (resolveUser, not @CurrentUser) + provider chat() JSDoc
Tests:
- Unit: OpenAiCompatibleProvider (capabilities, mapNativeToolCalls, transport/HTTP
error mapping, allowlist enforcement) + OAuth buildOAuthProvider wiring
- E2e: HTTP secret-exposure (apiKeyEncrypted never returned), admin 403 for a
non-admin, authenticated self-service 200, SSE framing, shipped get_user tool
ownership (S_SELF) and admin-only delete_user visibility
Full `pnpm run check` green (audit, format, lint, tests, build, server start).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Auto-detection of JSON / native-tool support (provider-agnostic).
Connection flags supportsJsonResponse / supportsNativeTools are now OPTIONAL: undefined = auto-detect, explicit true/false = authoritative (never probed). Removed their mongoose default so "not set" stays distinguishable.
Detection (A + B, sharing one probe path):
- A (eager): on create with an undefined flag, the endpoint is probed once and the result persisted. Best-effort: never fails the create.
- B (lazy): if a flag is still undefined at prompt time, the orchestrator probes once and persists; until then the safe emulated baseline applies.
- On demand: admin detectAiConnectionCapabilities / POST /ai/connections/:id/detect-capabilities.
Probe (OpenAiCompatibleProvider.detectCapabilities, optional on ILlmProvider): response_format: json_object (2xx → JSON) and a trivial tool with tool_choice:'required' (2xx WITH tool_calls → native tools; 4xx / silent-ignore → unsupported). config.env seed only sets the flags when AI_SUPPORTS_* env vars are provided.
Review remediation catalog (all 6):
1. GraphQL resolver tests (admin findAiConnections happy-path + non-admin Access-denied)
2. get_user ownership denial now asserts a typed 401/403 error (was rejects.toBeDefined)
3. Dropped the redundant single-field index on aiInteractions.userId (compound covers it)
4. Removed the duplicate IsNotEmpty() on prompt (auto-applied for required fields)
5. assertWithinBudget(language) documented as deprecated/unused (kept for compat)
6. Added a mountAiMcpOAuth mount-wiring unit test
Tests: +7 unit (detectCapabilities x3, mount, GraphQL coverage) and +6 e2e (eager/explicit/endpoint/403 detection, GraphQL admin+denial). README, IAiDefaultConnection JSDoc, and the migration guide document auto-detection. Full `pnpm run check` green.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
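The tri-state flag semantics (undefined = detect, explicit value = authoritative) and the probe interpretation could be expressed as below. The `Probe` shape and `resolveFlags` helper are assumptions; only the decision rules are taken from the description above.

```typescript
type Flags = { supportsJsonResponse?: boolean; supportsNativeTools?: boolean };
// Hypothetical probe outcome: HTTP status plus whether the reply contained tool_calls.
type Probe = { status: number; hasToolCalls?: boolean };

const is2xx = (p: Probe): boolean => p.status >= 200 && p.status < 300;

function resolveFlags(stored: Flags, probe: { json: Probe; tools: Probe }): Required<Flags> {
  return {
    // `??` only falls through on undefined/null, so an explicit `false` stays authoritative.
    supportsJsonResponse: stored.supportsJsonResponse ?? is2xx(probe.json),
    // A 2xx without tool_calls means the endpoint silently ignored the tool: unsupported.
    supportsNativeTools:
      stored.supportsNativeTools ?? (is2xx(probe.tools) && probe.tools.hasToolCalls === true),
  };
}
```

In the real module an explicit flag is never probed at all; the sketch only shows why removing the schema default matters, since a defaulted `false` would be indistinguishable from "not set".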
Durable knowledge captured by the lt-dev review agents during the AI-module reviews (backend, docs, performance, security, test reviewers): e.g. AI secret stripping, FRAMEWORK-API generator allowlist, and the doc surfaces a configurable feature must touch. Consistent with the already-tracked agent-memory structure. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…tion In emulated tool calling (providers without native function calling), weak models sometimes reply with a natural-language "action done" message without ever emitting the tool_calls request — so the tool never actually runs and the confirmation gate is bypassed (a false positive, never a real side-effect: execution can only happen via executeToolCall() + the gate). Add an explicit instruction to the emulated tool protocol: never claim to have executed/performed/deleted/updated/created anything without first emitting a tool_calls request and receiving its TOOL_RESULTS. Document the residual limitation (and the intact security guarantee) in the module README, recommending native tool-calling backends for action-heavy workflows. Observed during real fullstack E2E testing against a live OpenAI-compatible endpoint without native tool support. Full e2e suite green (1839 tests). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Format-only follow-up to 786aee3 — normalize markdown emphasis (oxfmt). No content change. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…kend)
Proves the module connects to all three backend kinds — external (hosted
OpenAI-compatible), local (e.g. Ollama via the same provider) and the local
Claude Code CLI — via the existing ILlmProvider extension point.
ClaudeCliProvider shells out to `claude -p --output-format json` and parses the
single result + usage. Security model:
- `--tools ""` disables ALL of Claude Code's own tools — it is a pure text
generator, so it cannot read files, run shell commands, or reach the network on
its own. Tool calling is emulated (the orchestrator executes tools itself via
CrudService with the caller's permissions).
- `spawn` with an argument array (never a shell) — prompt content can't be
interpreted as a command. Conversation is piped via stdin.
- Runs in tmpdir (no CLAUDE.md/settings auto-discovery), `--system-prompt`
replaces Claude Code's default agent prompt, `--no-session-persistence`, and a
timeout that kills the child on overrun. Optional `ai.claudeCli` config
(bin/extraArgs/maxBudgetUsd); `apiKey` is forwarded as ANTHROPIC_API_KEY.
Opt-in (not auto-registered): register via
`factory.registerBuilder('claude-cli', (c) => new ClaudeCliProvider(c))`.
Exported from the AI module barrel + top-level index. README documents external /
local / CLI connection recipes. +7 unit tests (capabilities, tool-free argv,
transcript flattening, result/usage parse, error + transport mapping, factory
wiring); live-verified against the real CLI. Full e2e suite green (1846 tests).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
In emulated tool calling a (weaker / JSON-mode) model sometimes replies with a bare
protocol wrapper like `{"tool_calls":[]}` instead of a user answer. The orchestrator
treated that as the final text and surfaced the raw JSON to the user.
Now, when the response parses to a protocol-shaped object with no `final`:
- nudge the model once ("reply with your final answer as {\"final\":\"…\"}") and
continue the loop, and
- if it still returns only a `tool_calls`/`final` wrapper, drop it so the generic
"could not produce a final answer" fallback applies instead of leaking JSON.
Found during real fullstack E2E with a local Ollama model (qwen2.5) in JSON mode.
+2 unit tests (nudge recovers a final answer; bare wrapper never surfaces). Full
e2e suite green (1848 tests).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…tale keys
Hardening to match the existing email/cookie production guards. Previously a missing AI encryption secret only logged a warning and silently used a public, insecure development default, so insecure key storage could ship to production.
- AiCryptoService.assertProductionSafe() (run in onModuleInit): throws at boot in production/staging when AI is active but no secret is resolvable (ai.encryptionSecret / NSC__AI__ENCRYPTION_SECRET / SECRETS_ENCRYPTION_KEY). Non-prod keeps the warn-only dev default. getKey() now shares resolveSecret().
- CoreAiMcpOAuthService.assertProductionSafe() (onModuleInit): same guard for the OAuth signing secret, but only when ai.mcp.oauth is enabled.
- CoreAiConnectionService: the defaultConnection seed now WARNS when configured incompletely (missing baseUrl/model/name) instead of skipping silently; plus a best-effort boot self-check that logs which connections' stored API keys can no longer be decrypted (e.g. after a secret rotation), which is actionable at boot instead of failing deep in a request. Never throws.
Behavior change (warn → throw in prod/staging): documented in the AI INTEGRATION-CHECKLIST and the 11.26.0 migration guide. +2 unit tests (both guards across prod/staging/local × secret/no-secret × oauth on/off). Full e2e green (1850).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ling text
Emulated tool-call extraction used a naive first-`{` … last-`}` slice. A model that
keeps writing after its JSON — e.g. the Claude CLI emits a valid
`{"tool_calls":[…]}` and then a self-hallucinated `TOOL_RESULTS:` block in the same
turn — produced an unparseable slice, so the tool call was missed entirely: the tool
never ran and the raw JSON leaked as the final answer. This broke function calling on
any backend that doesn't stop cleanly after the JSON.
- extractJsonObject() now prefers the first *brace-balanced* object (string-aware),
falling back to the old slice. New protected firstBalancedJson() helper.
- Emulated tool protocol now instructs the model to STOP after tool_calls and not
write the results itself.
Verified end-to-end through the orchestrator with the real Claude CLI: server_time is
now actually executed (2 iterations, correct final answer) where it previously failed.
+1 unit test reproducing the trailing-text case. Full e2e green (1851).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
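The difference between the naive first-`{` to last-`}` slice and a brace-balanced, string-aware scan can be sketched as follows. The function names match those mentioned in the commit message, but the bodies are an independent illustration, not the module's code.

```typescript
// Return the first complete {...} object in the text, ignoring braces inside JSON strings.
function firstBalancedJson(text: string): string | undefined {
  const start = text.indexOf("{");
  if (start < 0) return undefined;
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false; // character after a backslash is taken literally
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") depth++;
    else if (ch === "}" && --depth === 0) return text.slice(start, i + 1);
  }
  return undefined; // unbalanced input
}

function extractJsonObject(text: string): unknown {
  // Old behaviour kept as a fallback: first "{" to last "}".
  const naive = text.slice(text.indexOf("{"), text.lastIndexOf("}") + 1);
  for (const candidate of [firstBalancedJson(text), naive]) {
    if (!candidate) continue;
    try {
      return JSON.parse(candidate);
    } catch {
      // not valid JSON, try the next candidate
    }
  }
  return undefined;
}
```

With trailing text that itself contains braces, the naive slice spans both objects and fails to parse, while the balanced scan stops at the end of the first object.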
After detecting emulated tool calls, the orchestrator pushed the model's raw
completion text as the assistant turn. When a model appends a self-hallucinated
`TOOL_RESULTS:` block after its tool_calls (observed with the Claude CLI), that fake
block was fed back alongside the REAL results, confusing the model — its final answer
became a meta-complaint ("these TOOL_RESULTS were not requested by me…") instead of a
clean summary, even though the tools executed correctly.
Now record a normalized assistant turn containing only `{"tool_calls":[…]}` (name +
arguments), never the raw text. Verified end-to-end via Chrome MCP across all three
backends (external/mittwald, local/Ollama, Claude CLI): multi-tool calls + result
processing now produce clean answers, and the confirm-before-execute gate works on
each. +1 unit test (assistant turn never carries hallucinated trailing text). Full
e2e green (1852).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…oop, context-window handling
Makes prompt construction transparent, adaptive and self-improving so non-technical
users only enter domain prompts while the system supplies everything the LLM needs —
across all backends (external/local/CLI) — and always within the security model.
- DB-editable prompt store (CoreAiPromptTemplate, admin CRUD): the system prompt is
assembled from keyed fragments (base, permissions, anti_hallucination,
output_contract, tool protocol, error_guidance, learned_hints, plan_protocol). Ships
rich built-in defaults; DB rows override per key (locale/capability scoped). No prompt
text is hard-coded-and-unreachable.
- Rich auto-enrichment in CoreAiPromptBuilderService (now template-driven + async):
roles, exact allowed tools, tool catalog with parameter schemas, anti-hallucination
contract ("never invent/guess; use a tool or say you don't know"), structured output
contract, error guidance — placeholders rendered at build time.
- Structured tool errors fed back to the model ({ code, message, hint }) so it can
self-correct instead of pretending success.
- Governed self-improvement loop (CoreAiPromptHint + CoreAiPromptHintService):
orchestrator records failure signals (tool_error/exception/not_available); recurring
patterns become learned hints. Default governed (admin approves); ai.promptLearning
{ enabled, autoApply, minOccurrences } can auto-apply. Hints only ADD guidance — never
relax permissions.
- Per user/session context-window handling: contextWindow per connection (+ ai.contextWindow
fallback 8192); fitMessagesToContext trims the oldest session-history turns (keeping the
system prompt + current input), truncates as a last resort, and caps oversized tool
results (ai.maxToolResultChars). Applied in auto + plan modes.
- New override hooks (promptTemplateService, promptHintService), barrel + config exports,
IAi config additions. +9 unit tests. Full e2e green (1858).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
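The trimming strategy for the context window (drop the oldest history turns, keep the system prompt and current input, truncate only as a last resort) can be sketched like this. The four-characters-per-token estimate and the message shape are assumptions made for the example; the real `fitMessagesToContext` may count tokens differently.

```typescript
type Msg = { role: "system" | "user" | "assistant"; content: string };

// Crude token estimate (about 4 characters per token); an assumption for this sketch.
const estimateTokens = (m: Msg): number => Math.ceil(m.content.length / 4);

function fitMessagesToContext(messages: Msg[], contextWindow = 8192): Msg[] {
  const kept = [...messages];
  const total = (): number => kept.reduce((sum, m) => sum + estimateTokens(m), 0);
  // Drop the oldest non-system message that is not the current (last) input.
  while (total() > contextWindow) {
    const idx = kept.findIndex((m, i) => m.role !== "system" && i < kept.length - 1);
    if (idx < 0) break; // only system prompt(s) and the current input remain
    kept.splice(idx, 1);
  }
  // Last resort: truncate the current input to whatever budget is left.
  if (total() > contextWindow && kept.length > 0) {
    const last = kept[kept.length - 1];
    const others = total() - estimateTokens(last);
    const budgetChars = Math.max(0, (contextWindow - others) * 4);
    kept[kept.length - 1] = { ...last, content: last.content.slice(0, budgetChars) };
  }
  return kept;
}
```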
…xt window
Phase 5/6 of the self-optimizing prompt system:
- Admin endpoints (GraphQL + REST, all @Roles(ADMIN)) to CRUD prompt-template fragments and the learned prompt hints, so admins can edit every prompt text and review/approve/reject the governed learning loop.
- Context-window size is now auto-detected per LLM: OpenAiCompatibleProvider probes a local Ollama /api/show then falls back to a known-model table; ClaudeCliProvider returns the model-appropriate size (200k, 1M for 1m variants). detectAndPersistCapabilities() persists the detected window alongside the capability flags; the orchestrator's lazy-detect guard also fires when the window is still unknown.
- Tests: +unit (detectContextWindow for known models + claude alias) and +e2e (template override applies to the prompt; learning loop end-to-end with a throwing tool → suggested hint → admin approve → hint in prompt; admin-only prompt-template/hint queries + non-admin denial). Full suite green (61 files, 1863 e2e; 80 ai unit).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Demonstrates the opt-in LlmProviderFactory.registerBuilder() pattern in the reference implementation so a connection with providerType 'claude-cli' works out of the box in the framework's own server. No effect on existing tests (no test creates a claude-cli connection); full ai e2e green (32). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…acting
For non-technical end users a focused clarifying question often beats a wrong
action. The model can now call the built-in `ask_user_question` tool to pause
the run and ask the user (with optional multiple-choice options) instead of
guessing.
The orchestrator detects the tool's sentinel return shape, breaks the loop,
and surfaces `CoreAiResponse.pendingQuestion = { question, options? }`. The
client renders it and sends the user's answer as the next prompt, so the
conversation continues naturally; no special input field is needed for the answer.
The tool is auto-registered with `roles: [S_USER]`, non-mutating; permission
restrictions are still enforced backend-side regardless of how the prompt is
phrased. +1 unit test verifying short-circuit + payload.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ating tools)
End users should not have to re-confirm the same mutating action repeatedly
when they have already granted consent. Adds `aiToolGrants` (admin CRUD) and
a new `rememberDecision` field on `CoreAiPromptInput`:
- When the user confirms (`confirm: true`) AND sets `rememberDecision` to
'conversation', 'user' or 'tenant', the orchestrator persists a grant for
that scope after a successful, non-destructive mutating execution.
- On future iterations, the confirmation gate consults the grant store: if an
active, non-expired grant covers the tool in any scope (user / tenant /
conversation), the gate is skipped for that call. Destructive tools NEVER
use grants — they always confirm.
- Grants only skip the confirmation gate. The permission model itself
(`@Restricted`, `@Roles`, `authorize()`) is enforced backend-side
regardless.
Override the store via `CoreModule.forRoot(env, { ai: { toolGrantService } })`.
+2 unit tests verifying skip-on-grant + persist-on-remember.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
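Taken together with the earlier confirmation policy, the gate's decision could be sketched as a single predicate. Everything here is a simplified assumption: the parameter shapes are hypothetical, grant scope matching is reduced to the tool name, and `enforced` is interpreted as "the configured default applies and the client override is ignored".

```typescript
type Tool = { name: string; mutating?: boolean; destructive?: boolean };
type Grant = { tool: string; scope: "conversation" | "user" | "tenant"; expiresAt?: number };
type MutatingPolicy = { default: boolean; enforced: boolean }; // ai.confirmation.mutating

function confirmationRequired(
  tool: Tool,
  policy: MutatingPolicy,
  input: { confirm?: boolean; requireConfirmation?: boolean },
  grants: Grant[],
  now = Date.now(),
): boolean {
  if (input.confirm) return false; // prompt was re-sent with confirm: true
  if (tool.destructive) return true; // destructive tools never use grants
  if (!tool.mutating) return false; // read-only tools are not gated
  // Client override counts only when the policy is not enforced.
  const wanted = policy.enforced ? policy.default : (input.requireConfirmation ?? policy.default);
  if (!wanted) return false;
  // An active, non-expired grant for this tool skips the gate.
  const granted = grants.some(
    (g) => g.tool === tool.name && (g.expiresAt === undefined || g.expiresAt > now),
  );
  return !granted;
}
```

The predicate only decides whether to ask; role checks and `authorize()` would still run separately before any execution.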
…Stop)
Adds a generic hook registry so projects can register compliance-, audit- and
sanitization callbacks at well-defined points of the agent loop without forking
the orchestrator. Hooks are NestJS providers extending AiHookBase that
self-register in the global AiHookRegistry on module init.
- `preToolUse(call, tool, event)`: can `{block: true}` the call (returned as a
structured BLOCKED_BY_HOOK error to the LLM) or replace `args` (chained
sanitization across hooks).
- `postToolUse(call, tool, {result, success}, event)`: pure notification — for
webhooks, metrics, audit shipping.
- `sessionStart(event)`: called once before the first LLM call.
- `stop(response, event)`: called once at the end of every run.
Errors thrown inside hooks are swallowed + warn-logged — a misbehaving hook
must never crash a prompt. Hooks can only ADD restrictions; they cannot relax
the existing permission model (@Restricted/@Roles/authorize() still apply).
+2 unit tests: blocking hook stops execution + observe hook receives notify.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
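A rough sketch of how a `preToolUse` chain with blocking, chained argument replacement, and swallowed hook errors might be run. The hook signature is reduced to a single `call` argument (the real hooks also receive the tool and an event), and `runPreToolUse` is a hypothetical helper name.

```typescript
type Call = { name: string; args: Record<string, unknown> };
type PreResult = { block?: boolean; reason?: string; args?: Record<string, unknown> } | undefined;
type Hook = { preToolUse?: (call: Call) => PreResult };

function runPreToolUse(hooks: Hook[], call: Call) {
  let args = call.args;
  for (const hook of hooks) {
    try {
      // Each hook sees the arguments as rewritten by the hooks before it.
      const res = hook.preToolUse?.({ ...call, args });
      if (res?.block) {
        return {
          blocked: true as const,
          error: { code: "BLOCKED_BY_HOOK", message: res.reason ?? "blocked by hook" },
        };
      }
      if (res?.args) args = res.args; // chained sanitization
    } catch (err) {
      // A misbehaving hook must never crash the prompt: log and continue.
      console.warn("AI hook failed and was ignored:", err);
    }
  }
  return { blocked: false as const, args };
}
```

Returning the block as a structured error rather than throwing lets the orchestrator feed it back to the model like any other tool error.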
Adds a second permission layer underneath the role-based tool gating: admins can author fine-grained allow / ask / deny rules against the arguments of a tool call without needing to write code.
Example: a generic dbQuery tool can be opened up to a support role with:
- allow `sql` matching `^SELECT\\b`
- deny `sql` matching `(?i)\\b(DROP|TRUNCATE|DELETE FROM)\\b`
- ask `sql` matching `(?i)\\bUPDATE\\b`
Evaluation across all matching policies follows the precedence deny > ask > allow so a deny anywhere always wins; an ask routes the call through the existing confirmation gate even if the tool itself isn't `mutating`. A pure allow lets the call proceed without re-gating. No matching policy → fall through to the existing behaviour.
Policies stack across scopes: `tool` (any user), `role`, `tenant`, `user`. Like all other layers, policies can only TIGHTEN: they never relax the underlying permission model (@Restricted / @Roles / authorize() still apply).
+2 unit tests verifying deny aborts + ask forces the confirmation gate.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
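The deny > ask > allow precedence can be illustrated with the dbQuery rules from the example. The `Policy` shape and `evaluatePolicies` function are hypothetical; the sketch also uses the JavaScript `i` flag instead of a leading `(?i)`, which JavaScript regular expressions do not accept.

```typescript
type Effect = "allow" | "ask" | "deny";
type Policy = { effect: Effect; field: string; pattern: RegExp };

function evaluatePolicies(policies: Policy[], args: Record<string, unknown>): Effect | undefined {
  // Collect the effect of every policy whose pattern matches its argument field.
  const matched = policies
    .filter((p) => p.pattern.test(String(args[p.field] ?? "")))
    .map((p) => p.effect);
  if (matched.includes("deny")) return "deny"; // a deny anywhere always wins
  if (matched.includes("ask")) return "ask"; // route through the confirmation gate
  if (matched.includes("allow")) return "allow";
  return undefined; // no matching policy: fall through to the existing behaviour
}

const dbQueryPolicies: Policy[] = [
  { effect: "allow", field: "sql", pattern: /^SELECT\b/ },
  { effect: "deny", field: "sql", pattern: /\b(DROP|TRUNCATE|DELETE FROM)\b/i },
  { effect: "ask", field: "sql", pattern: /\bUPDATE\b/i },
];
```

For instance, `SELECT 1; DROP TABLE users` matches both the allow and the deny rule, and the deny takes precedence.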
…Use hook test to avoid config merge leakage from prior tests
With many tools, the JSON-Schema catalog can dominate the system prompt and
cost a large chunk of every request. `ai.deferToolSchemas: true` switches the
prompt to a NAMES + descriptions only listing; the LLM uses the new built-in
`search_tools` meta-tool to fetch a specific tool's parameter schema on demand.
Result: massively smaller default context footprint for projects with rich
tool registries, with no loss of capability for the model.
`search_tools` is role-gated and returns only the tools the current user can
already see (defense in depth on top of the orchestrator's role filter).
TDD: +2 unit tests (defer-on emits names + hint, schemas hidden; defer-off
keeps the full catalog as before).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
When a session would overflow the model's context window, the orchestrator
now summarizes the oldest non-system / non-last turns into a single short
system message via the same connection's provider — instead of dropping them.
The result: long sessions stay coherent (cross-turn intent preserved) instead
of losing context to hard truncation. Falls back to the hard trim path on any
error.
Configurable: `ai.compaction: false` to disable.
TDD: +1 unit test with a fake provider returning a known summary verifies the
oldest turns get replaced by the summary and the system + current prompt are
preserved.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
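A minimal sketch of the compaction idea, under stated assumptions: `compact` and `ChatMessage` are hypothetical names, the trigger is a simple turn count instead of a token estimate, and `summarize` stands in for a call to the connection's provider (synchronous here only to keep the example short).

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Replaces the oldest non-system turns with one summary system message.
// `keepRecent` is the number of most recent turns preserved verbatim
// (at least the current prompt).
function compact(
  messages: ChatMessage[],
  keepRecent: number,
  summarize: (turns: ChatMessage[]) => string,
): ChatMessage[] {
  const system = messages.filter((m) => m.role === 'system');
  const turns = messages.filter((m) => m.role !== 'system');
  if (turns.length <= keepRecent) return messages;

  const oldest = turns.slice(0, turns.length - keepRecent);
  const recent = turns.slice(turns.length - keepRecent);
  try {
    const summary: ChatMessage = {
      role: 'system',
      content: `Summary of earlier turns: ${summarize(oldest)}`,
    };
    return [...system, summary, ...recent];
  } catch {
    // Fallback: hard trim, i.e. drop the oldest turns without a summary.
    return [...system, ...recent];
  }
}
```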
A `CoreAiMode` row bundles a curated tool whitelist, optional model override,
optional role gating and an optional prompt addendum under a name like
`support`, `audit`, `billing`. End-user prompts activate one with
`agentMode: 'support'` — the orchestrator then narrows the available tool set
to the whitelist (built-in ask_user_question + search_tools always stay
available — they are essential for end-user UX).
Modes only ever TIGHTEN; they cannot relax the underlying permission model.
Combines with the existing prompt-template `mode:<name>` scope filter (#2):
admins can author per-mode prompt fragments that fire only when that mode is
active.
TDD: +1 unit test — a model in `support` mode tries to call a non-whitelisted
tool and gets the standard TOOL_NOT_AVAILABLE response.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
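The "modes only tighten" rule amounts to intersecting the tools a user can already see with the mode's whitelist. A sketch, assuming hypothetical names (`AgentMode`, `toolsForMode`) and plain string tool names:

```typescript
// Built-ins that stay available in every mode, per the commit message.
const ALWAYS_AVAILABLE = ['ask_user_question', 'search_tools'];

// Hypothetical, reduced view of a CoreAiMode row.
interface AgentMode {
  name: string;
  tools: string[];
}

// `visibleTools` is the role-filtered set the user may already call.
// The mode can only remove entries from it, never add new ones.
function toolsForMode(visibleTools: string[], mode?: AgentMode): string[] {
  if (!mode) return visibleTools;
  const allowed = new Set([...mode.tools, ...ALWAYS_AVAILABLE]);
  return visibleTools.filter((tool) => allowed.has(tool));
}
```

Because the result is filtered from `visibleTools`, a whitelist entry the user is not permitted to see has no effect.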
End users can now attach images / files to a prompt via `input.attachments`
(array of `{mimeType, dataUrl|url, name?}`). The orchestrator forwards them on
the user message; OpenAiCompatibleProvider translates them to the OpenAI
content-parts shape (`[{type:'text'},{type:'image_url'}]`) — sent to vision-
capable backends (mittwald-Ministral, OpenAI, Anthropic via OpenAI-compat
gateways, Ollama vision models). Providers without vision support naturally
ignore them.
Adds LlmAttachment interface + `LlmMessage.attachments` so other providers
can adopt the mapping in one place. Powers product flows like "send a
screenshot + describe the bug".
TDD: +1 unit test verifies attachments survive from CoreAiPromptInput to the
provider on the user message.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
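The attachment mapping can be sketched as below. The `LlmAttachment` fields follow the shape named in the commit message; `toContentParts` is a hypothetical helper name, and dropping non-image attachments is an assumption of this sketch, not confirmed behaviour of `OpenAiCompatibleProvider`.

```typescript
interface LlmAttachment {
  mimeType: string;
  dataUrl?: string;
  url?: string;
  name?: string;
}

// OpenAI-style content parts for a multimodal user message.
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } };

function toContentParts(
  text: string,
  attachments: LlmAttachment[] = [],
): string | ContentPart[] {
  // Assumption: only image attachments with a usable URL are forwarded.
  const images = attachments.filter(
    (a) => a.mimeType.startsWith('image/') && Boolean(a.dataUrl ?? a.url),
  );
  // Plain string content keeps text-only requests unchanged.
  if (images.length === 0) return text;
  return [
    { type: 'text', text },
    ...images.map(
      (a): ContentPart => ({
        type: 'image_url',
        image_url: { url: (a.dataUrl ?? a.url) as string },
      }),
    ),
  ];
}
```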
`CoreAiMcpClientService.registerExternalClient({ name, client })` takes an
MCP-like client (the SDK `Client`, or a duck-typed one), calls `listTools()`,
and registers each as a wrapper tool in the global AiToolRegistry under the
namespaced name `<clientName>_<toolName>`. The wrapper dispatches `execute()`
back to the MCP server's `callTool` and normalises the response into our
{success, data, message} shape.
Imported tools are role-gated like any other tool and ship with the
conservative defaults `mutating: true` + `roles: [S_USER]` — so they always
require confirmation unless an admin scoped-policy (or grant) relaxes that.
Projects bring their own MCP client (stdio spawn / StreamableHTTP / SSE /
OAuth) — this service is the glue that turns it into a first-class set of
backend tools. Override the service for custom connection logic.
TDD: +1 unit test — a fake MCP client returns one tool, we register it, and
verify the wrapper dispatches `callTool` correctly and surfaces the result.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
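A rough sketch of the wrapping step, assuming a duck-typed client that exposes `listTools()` and `callTool()` as the MCP SDK `Client` does. `wrapMcpTools`, `WrappedTool` and the `S_USER` string value are hypothetical stand-ins; registration in the `AiToolRegistry` is left out.

```typescript
// Duck-typed subset of an MCP client.
interface McpLikeClient {
  listTools(): Promise<{ tools: { name: string; description?: string }[] }>;
  callTool(req: {
    name: string;
    arguments?: Record<string, unknown>;
  }): Promise<{ content?: unknown; isError?: boolean }>;
}

interface ToolResult {
  success: boolean;
  data?: unknown;
  message?: string;
}

interface WrappedTool {
  name: string;
  mutating: boolean;
  roles: string[];
  execute(args: Record<string, unknown>): Promise<ToolResult>;
}

// Stand-in for the framework's role constant; the literal value is an assumption.
const S_USER = 's_user';

async function wrapMcpTools(
  clientName: string,
  client: McpLikeClient,
): Promise<WrappedTool[]> {
  const { tools } = await client.listTools();
  return tools.map(
    (tool): WrappedTool => ({
      name: `${clientName}_${tool.name}`, // namespaced registry name
      mutating: true, // conservative default: always confirm
      roles: [S_USER],
      async execute(args: Record<string, unknown>): Promise<ToolResult> {
        try {
          const res = await client.callTool({ name: tool.name, arguments: args });
          return res.isError
            ? { success: false, data: res.content, message: 'MCP tool reported an error' }
            : { success: true, data: res.content };
        } catch (err) {
          return { success: false, message: String(err) };
        }
      },
    }),
  );
}
```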
…n responses
Two new payload fields drive the chat UI's usage indicators:
1. `CoreAiBudgetSummary.maxTokens` + `scope` ('user' | 'tenant' | 'llm'):
resolved as user-limit → tenant-limit → LLM context-window. Lets the client
render a usage progress bar against the effective ceiling for THIS user.
Null = no limit at all (frontend hides the bar).
2. `CoreAiResponse.contextWindow = { used, total }`: current session token
utilization vs. the connection's auto-detected context window. Powers a
"closing-circle" indicator (Claude Code-style) that appears once the model
is approaching its context limit.
Both work even without a user (LLM-window fallback) and even without the
budget service (no budget = LLM fallback only).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
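The user-limit → tenant-limit → LLM context-window resolution can be sketched as a simple fallback chain. `resolveCeiling` and the input shape are hypothetical; returning `scope: null` together with `maxTokens: null` is an assumption of this sketch.

```typescript
type BudgetScope = 'user' | 'tenant' | 'llm';

interface Ceiling {
  maxTokens: number | null; // null = no limit at all (frontend hides the bar)
  scope: BudgetScope | null;
}

function resolveCeiling(limits: {
  user?: number;
  tenant?: number;
  contextWindow?: number;
}): Ceiling {
  if (limits.user !== undefined) return { maxTokens: limits.user, scope: 'user' };
  if (limits.tenant !== undefined) return { maxTokens: limits.tenant, scope: 'tenant' };
  if (limits.contextWindow !== undefined) {
    return { maxTokens: limits.contextWindow, scope: 'llm' };
  }
  return { maxTokens: null, scope: null };
}
```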
…obal visibility
A user-facing companion to the admin-only CoreAiPromptTemplate: any signed-in
user can author short, named prompt snippets that they can insert into the
chat input with one click. Different from the system-prompt building blocks,
these are USER prompts (e.g. "Schreibe eine kurze Antwort zu …", German for
"Write a short reply to …").
Visibility scopes (enforced server-side; the picker only sees what it's
allowed to see):
- 'user' — only the owner sees it (default).
- 'tenant' — all members of the owner's tenant see it (tenant context required).
- 'global' — every signed-in user sees it (creation requires the ADMIN role).
Owner-only mutations: only the creator can update/delete; admins still pass
via the standard admin pipeline. The full security model:
- listVisible(): server-side $or query (own + global + tenant), filtered to
enabled snippets, ordered by `order` asc then `name`.
- create(): runs the standard pipeline first (validation + per-input
whitelist), then writes the system-owned ownerId/scope/tenantId via a
direct update — the input DTO deliberately doesn't expose those fields,
so `prepareInput` would strip them if we passed them through super.create.
- update(): assertOwner first, strip ownerId/tenantId from input,
validate scope changes (incl. admin gate for 'global'), then persist
tenantId directly when scope changed.
- securityCheck(): per-row filter that returns undefined for snippets the
caller is not allowed to see (own + tenant-member + global pass).
- assertOwner(): ADMIN bypass; otherwise compares row.ownerId to user.id.
Module wiring: new options field promptSnippetService, MongooseModule.forFeature
schema entry, AI_PROMPT_SNIPPET_CLASS + service token, provider registration,
barrel exports. The aiPromptSnippets collection has a unique compound index
{ ownerId: 1, name: 1 } so a user can't shadow their own snippet by accident.
Endpoints (all S_USER):
REST GET /ai/snippets
POST /ai/snippets
PUT /ai/snippets/:id
DELETE /ai/snippets/:id
GraphQL findAiPromptSnippets
createAiPromptSnippet
updateAiPromptSnippet
deleteAiPromptSnippet
Tests: +5 e2e covering scope filtering (own/tenant/global), admin-only global
creation, tenant-context requirement, owner-only mutations, and the REST +
GraphQL surface. Full suite green (1883 tests).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
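The `listVisible()` query described in this commit (own + global + tenant, enabled only, ordered by `order` then `name`) could be built roughly like this. The function name and the `Caller` shape are hypothetical; the objects are plain MongoDB-style filter and sort documents, so the sketch runs without a database.

```typescript
interface Caller {
  id: string;
  tenantId?: string;
}

function visibleSnippetQuery(caller: Caller): {
  filter: Record<string, unknown>;
  sort: Record<string, 1 | -1>;
} {
  const or: Record<string, unknown>[] = [
    { ownerId: caller.id }, // the caller's own snippets, any scope
    { scope: 'global' }, // visible to every signed-in user
  ];
  if (caller.tenantId) {
    // Tenant-scoped snippets only match within the caller's tenant.
    or.push({ scope: 'tenant', tenantId: caller.tenantId });
  }
  return {
    filter: { enabled: true, $or: or },
    sort: { order: 1, name: 1 },
  };
}
```

Callers without a tenant context never get a tenant clause, so tenant-scoped snippets of other tenants cannot match by accident.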
…al' prompt scope
Phase 1 of the AI-naming overhaul. Renames the two prompt-side stores to terms
that match how they're actually used in the UI and conversations:
* CoreAiPromptTemplate → CoreAiSlot
The admin-edited building blocks of the SYSTEM prompt. Each row fills a
keyed slot (`base`, `permissions`, `anti_hallucination`, `tool_catalog`,
`output_contract`, `tool_protocol_emulated`, `error_guidance`, …) and
overrides the framework default for that key. The model already used
"slot" in its description text — the file/class/collection names now
match. Adds an auto-set `tenantId` field so Phase 2 can introduce
tenant-scoped overrides without another schema bump.
* CoreAiPromptSnippet → CoreAiPrompt
User-facing re-usable prompts ("Vorlagen", German for templates) — short, named user-prompt
presets that can be inserted into the chat input with one click. No
longer to be confused with the system-prompt building blocks above.
The existing CoreAiPromptInput (= the payload of the aiPrompt mutation, i.e.
the user's question to the LLM) keeps its name; the CRUD inputs of the new
model are CoreAiPromptCreateInput / CoreAiPromptUpdateInput.
Endpoints and GraphQL operations follow:
* /ai/prompt-templates → /ai/slots
findAi/createAi/updateAi/deleteAiPromptTemplate(s)
→ findAi/createAi/updateAi/deleteAiSlot(s)
* /ai/snippets → /ai/prompts
findAi/createAi/updateAi/deleteAiPromptSnippet(s)
→ findAi/createAi/updateAi/deleteAiPrompt(s)
The user-facing prompt scope drops `'global'`. The backend previously allowed
admins to create system-wide prompts; admins now have a proper tenant-scoped
system-prompt extension mechanism via Slots instead. Remaining valid scopes
are `'user'` (private, default) and `'tenant'` (public within the tenant).
The backend now rejects `scope: 'global'` for everyone — including admins.
Collections (aiPromptTemplates, aiPromptSnippets) are renamed to (aiSlots,
aiPrompts). The AI module hasn't been used in production yet, so no DB
migration is shipped — fresh installs start clean.
Module override field renamed from `promptTemplateService`/`promptSnippetService`
to `slotService`/`promptService`. Module exports updated.
37/37 ai e2e tests green. Full `pnpm run check` green.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…er registry
Phase 2 + 3 of the AI naming overhaul.
Phase 2 — Slot tenant-scoping + system-default overrides
========================================================
`CoreAiSlot` now carries a `tenantId` that's set system-side from the calling
admin's tenant (RequestContext + serviceOptions). Multi-tenant deployments
get per-tenant overrides for free; single-tenant deployments stay
system-wide (tenantId stays undefined).
`CoreAiSlotService`:
- create / update / delete now require ADMIN and verify the row belongs to
the caller's tenant; tenantId is set system-side and stripped from input.
- `resolveFragments(...)` filters DB rows by the request's tenantId and now
also honors disabled (soft-deleted) system-default overrides — a row with
`enabled: false` matching a system key HIDES the default for that tenant.
- New `listEffective(...)` returns the framework defaults overlaid by the
tenant's rows, each with `isSystem` / `isOverride` flags so the admin UI
can render the right action (Bearbeiten / Zurücksetzen / Deaktivieren /
Löschen, i.e. edit / reset / disable / delete).
- New `resetSystemSlot(id, ...)` — deletes a tenant override row → the
framework default applies again. Refuses to operate on custom slots
(use `delete` for those).
The default-fragments definition was extracted to a module function
`getSystemDefaultSlots()` so both the prompt builder and the slot service
can reach it without a circular dependency. The builder's
`defaultFragments()` is now a thin wrapper that combines the system
defaults with the configured `ai.systemPrompt`.
New endpoints:
- `GET /ai/slots/effective` — system defaults + overrides + customs (admin).
- `POST /ai/slots/:id/reset` — reset a tenant override (admin).
Phase 3 — Runtime placeholder registry
======================================
New `CoreAiPlaceholderRegistry` service. The 6 system placeholders
(`roles`, `tools`, `toolCatalog`, `documentation`, `learnedHints`,
`userId`) are registered at boot; projects can add their own via
`register({ name, description, resolve })` from any provider.
Why a registry instead of hard-coded names in the frontend:
* The frontend loads the list via the new endpoint, so admin / user
editors see EVERY currently-supported placeholder (including project-
specific ones) without a frontend change.
* Resolvers stay in TypeScript — no eval, no DB-stored function bodies,
no admin-defined runtime code paths. Secure by construction.
The prompt builder's `renderContext()` now delegates to the registry when
available (falls back to the hard-coded record for legacy paths).
New endpoint: `GET /ai/placeholders` (S_USER — placeholders aren't secrets,
the resolver implementations stay backend-side).
Tests: 43/43 ai e2e green (was 37/37). Added:
- listEffective returns 10 system defaults on a clean tenant
- admin override + reset round-trip
- non-admin → ForbiddenException on slot reads/writes
- placeholder registry lists the 6 system placeholders
- non-admin user can hit `/ai/placeholders` (200)
- project-registered placeholder is honored end-to-end
Module override fields added: `placeholderRegistry`. Index barrel exports
the new interface + service.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…+ docs
Four follow-up optimizations on the AI naming overhaul + tenant-scoped slots
+ placeholder registry:
1. **Idempotent slot create** — `CoreAiSlotService.create()` now upserts on
`(tenantId, key)`: a second "Override anlegen" (create override) / "Deaktivieren" (disable) action on
the same system slot UPDATES the existing row instead of inserting a
duplicate. Live-verified via Chrome MCP — two overrides of `base`
produce a single DB row with the latest content; the row's `_id` stays
stable across edits.
2. **User-prompt placeholder resolution** — `CoreAiService.prompt()` now
runs a new `resolvePromptPlaceholders(input, serviceOptions)` pass
BEFORE prepareRun, replacing `{{placeholder}}` tokens in the user's
prompt text using the runtime placeholder registry. A stored prompt
template like "Erkläre dem Nutzer mit ID {{userId}} …" ("Explain to the user with ID {{userId}} …") now gets the
real id substituted at run time. Unknown tokens are left as-is so
plain text with curly braces survives. `promptStream()` inherits the
resolution because it delegates to `prompt()`.
3. **assertSameTenant performance** — the tenant-ownership check now selects
only the `tenantId` field (cheaper than loading the full document).
4. **Documentation** — README, INTEGRATION-CHECKLIST and the 11.25.x →
11.26.0 migration guide now use the new Slot / Prompt / placeholder-
registry terminology. The migration guide gains an "AI naming overhaul"
section documenting the rename for projects that experimented with
pre-release AI builds (cleanup steps: copy data from `aiPromptTemplates`
→ `aiSlots`, `aiPromptSnippets` → `aiPrompts`; routes / module
override names changed; user-prompt scope `'global'` removed).
Tests: +3 e2e (46/46 total ai e2e green) — covers (a) idempotent override,
(b) user-prompt placeholder substitution with real values, (c) unknown
tokens preserved.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
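The substitution rule from item 2 (known `{{placeholder}}` tokens are replaced, unknown tokens are left as-is) can be sketched with a single regular-expression pass. `resolvePlaceholders` and the `Map`-based registry are hypothetical simplifications of `resolvePromptPlaceholders` and `CoreAiPlaceholderRegistry`; tolerating whitespace inside the braces is an assumption.

```typescript
// A resolver returns the runtime value, or undefined if it has none.
type PlaceholderResolver = () => string | undefined;

function resolvePlaceholders(
  text: string,
  registry: Map<string, PlaceholderResolver>,
): string {
  return text.replace(
    /\{\{\s*([A-Za-z_][A-Za-z0-9_]*)\s*\}\}/g,
    (token: string, name: string) => {
      const value = registry.get(name)?.();
      // Unknown or unresolved tokens survive verbatim.
      return value === undefined ? token : value;
    },
  );
}
```

Because only registered names are touched, plain text that happens to contain curly braces passes through unchanged.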
A fresh AI module on a new project answers "I don't have a tool to do that"
for every domain question — that's correct behaviour (LLM is untrusted, the
registry starts empty), but new integrators (humans and AI agents) keep
reporting it as a bug. The docs now spell this out wherever someone might
land first:
- **CLAUDE.md** gains an "AI Module: Tools Are Opt-In" section right under
Core Principles so AI agents working with the framework see it before
starting any AI integration.
- **README.md** (AI module) gains a prominent callout above the existing
"## Tools" section explaining the no-auto-discovery contract, the
three-step path (write tool subclass / group in AiToolsModule / import
in ServerModule), the four reference tools, and the security contract
(CrudService + serviceOptions; never direct Model.find()).
- **INTEGRATION-CHECKLIST.md** §3 + §4 now explicitly flag both as
REQUIRED for the assistant to do anything domain-specific, and walk
through the copy-from-framework step with the exact import rewrites
('../../../../core/...' → '@lenne.tech/nest-server', user.service path
adjustment) plus the boot-still-works fallback.
No behavioural change.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ection
The token bar showed only the last request's tokens against the LLM context
window when no user/tenant budget was configured. That's neither a real
"used" value (it's per-request, not per-period) nor a real limit (the
context window is per-call, not per-day). Two changes:
1. **Cumulative usage even at scope='llm'.** `buildSummary` now always
aggregates the user's running per-period token total (via
`getUsage({ userId }, period)`) — same query that backs the hard-budget
path. The fallback hierarchy is unchanged:
user override > tenant override > config defaults > provider quota > context window
…but `summary.usedTokens` is now filled at EVERY level, not just for
hard limits. The UI's token bar can therefore show a real "verbraucht
im aktuellen Zeitraum" (used in the current period) counter against whatever
soft fallback applies.
2. **Provider-quota fallback on the connection.** Two new admin fields
on `CoreAiConnection`:
defaultUserMaxTokens?: number // e.g. 50000 (= soft cap per user)
defaultUserMaxPeriod?: string // 'day' (default) | 'month' | 'none'
When set, the connection's soft user quota is used as `maxTokens`
(still surfaced as scope='llm' so the UI knows it isn't a 429-able
hard limit). This is the technical landing spot for the user's "vom
LLM-Anbieter ermittelt" (determined by the LLM provider) requirement — the admin pins the provider's
per-user quota once per connection and the UI immediately reflects it.
The resolved-connection interface gains the two fields so the orchestrator
can pass them through to `buildSummary`; the connection input DTO accepts
them for create/update.
UI (in nuxt-base-starter): the token bar now reads `usedTokens` for every
scope (no more "last request" special case), the scope label for `'llm'`
becomes "Anbieter-Quota (weich)" (provider quota, soft), and the tooltip says
"Kumulativ (weiches Limit)" (cumulative, soft limit) so users understand the
bar fills over the period.
Tests: +2 e2e (48/48 ai e2e total green) — covers cumulative aggregation
under the context-window fallback and the new provider-quota fallback.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… placeholder registry
Version bump to ship the AI-module work that has been collecting on
feature/ai-module. The complete catalog of changes lives in
`migration-guides/11.25.x-to-11.26.0.md`, which the same commit extends with
the missing follow-up sections (user-prompt placeholder resolution, cumulative
budget summary at every scope, the new `defaultUserMaxTokens` /
`defaultUserMaxPeriod` provider quota on `CoreAiConnection`, and an explicit
"Tools are opt-in" callout for new integrators).
No breaking changes for projects that don't opt in to the AI module (absent
`ai` config block → module stays disabled). For projects that experimented
with the pre-release AI build, the "AI naming overhaul" section of the
migration guide lists every rename (`PromptTemplate → Slot`,
`PromptSnippet → Prompt`, the `'global'` user-prompt scope removal, etc.).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…rnings
Spectaql renders GraphQL descriptions through Handlebars and turned the AI
module's `{{placeholders}}` notation into a hard `Unsupported interpolation`
error. That error broke the `docs:bootstrap` script chain — spectaql crashed
before writing its tmp index.html, which then surfaced as a cascade of
follow-up ENOENT errors. The `{{placeholders}}` notation in slot / prompt
content descriptions is intentional: it documents the runtime placeholder
registry consumed by `CoreAiPromptBuilderService`, not a Handlebars binding.
Sets `spectaql.errorOnInterpolationReferenceNotFound: false` in `spectaql.yml`
(must sit under the `spectaql:` block, not at root — verified against
spectaql 3.0.9 source / examples). Unresolved interpolations now log as
`WARNING: Unsupported interpolation encountered: "{{placeholders}}"` and the
docs build runs to completion.
Verified: `pnpm run docs:bootstrap` exits 0; `public/index.html` is
regenerated; the ENOENT cascade in `server.log` is gone.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`dotenv` v17.4.2 emits a "◇ injected env (N) from .env // tip: …" line on
every `config()` call. The tip is randomly chosen from a hard-coded TIPS array
that includes promotional links to other Motdotla products (www.dotenvx.com,
www.vestauth.com). Since the framework already has structured logging via
NestJS, the banner is pure noise.
`{ quiet: true }` is dotenv's surgical kill switch — it disables the two
status `_log()` calls inside dotenv (the "injected env" line and the
encrypted-`.env.vault` loader line, which this codebase doesn't trigger
anyway). It does NOT silence `_warn()` (e.g. missing `.env.vault` for a
configured `DOTENV_KEY`) or `_debug()` (only active with `debug: true`), so
genuine misconfiguration warnings still surface.
Applied at both `dotenv.config()` call sites — `src/config.env.ts:19`
(top-level config bootstrap) and `src/core/common/helpers/config.helper.ts`
(env-aware helper used by consumer projects).
Verified by running `pnpm start` after the change — server boot log now starts
directly with `Configured for: local` and the regular NestJS InstanceLoader
output; both `injected env` lines and the `vestauth` promo are gone.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…rning
- Reword AI Slot descriptions to drop `{{placeholders}}` notation (the
registry endpoint at `/ai/placeholders` already documents the active
token set) — silences three spectaql "Unsupported interpolation"
warnings without losing information.
- Resolve oxlint warnings: switch `!= null` / `== null` to strict
comparisons in the placeholder/prompt resolver paths, rename
`registry` → `placeholderRegistry` in two test cases to avoid
shadowing the outer `AiToolRegistry` variable, and add a
`no-useless-constructor` disable + explanatory comment on the two
built-in tool subclasses (their constructors are NOT useless — they
re-export the protected `AiTool` constructor as public so NestJS DI
can instantiate them).
- Add `oxc: false` to `vitest.config.ts` and `vitest-e2e.config.ts`:
Vite 8 switched the default TS/JS transformer from esbuild to Oxc;
`unplugin-swc` disables esbuild internally, but without the new flag
Oxc would still run in parallel and emit a deprecation warning on
every test run.
- Migration guide 11.25.x→11.26.0: new "Operational notes (no action
required)" section covering the dotenv banner suppression, the
vitest/Vite 8 `oxc: false` tip, and the Slot description rewrite.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…tpUser cookie→JWT fallback + extended overrides
Round of fixes from a multi-agent code review of the AI module. All 9 HIGH-severity
findings addressed; one pre-existing test-flakiness in `tests/ai.e2e-spec.ts` fixed
in the process.
Security (MCP OAuth 2.1):
- Bind refresh-token rotation to client_id (OAuth 2.1 §4.13.2 / §7.4) — prevents
a stolen refresh token from being rotated by a different client and used to
impersonate the original user. `rotateRefreshToken(token, clientId)` now
atomically `findOneAndDelete({ token, clientId })`; `exchangeRefreshToken`
rejects requests where `client.client_id` is missing or does not match.
- Return `client_secret`/`client_secret_expires_at`/`token_endpoint_auth_method`
from `getClient()` so the MCP SDK's `clientAuth` middleware can actually
verify the secret. Previously the secret was persisted but dropped on read,
silently downgrading confidential clients to public clients (the SDK's
secret-verify branch is gated by `if (client.client_secret)`).
Performance:
- Add `index: true` to `CoreAiConversation.createdBy` — previously a collection
scan for `findByOwner`.
- Add `select: '-messages'` to `findAiConversations` / `findConversations` —
conversation list previously over-fetched up to 500 messages per item.
Public API typing:
- Extend `ICoreModuleOverrides.ai` with `mcpClientService`, `modeService`,
`placeholderRegistry`, `promptHintService`, `promptService`, `slotService`,
`toolGrantService`, `toolPolicyService` (already accepted at runtime via
`CoreAiModule.forRoot(...)`; TypeScript was blocking consumers from passing
them).
- Add typed `IAi.claudeCli?: { bin?, extraArgs?, maxBudgetUsd? }` — was
already consumed by `ClaudeCliProvider` but untyped.
- Sync the `.claude/rules/configurable-features.md` AI row to the post-rename
surface (`aiSlots` / `/ai/slots` / `slotService`); previous text contradicted
the README, INTEGRATION-CHECKLIST and migration guide.
Tests:
- 21 new unit tests covering the OAuth 2.1 store flow (single-use auth code,
client_secret roundtrip, refresh-token rotation bound to client_id incl. the
stolen-token-rejected case, `exchangeAuthorizationCode` and
`exchangeRefreshToken` end-to-end through `buildOAuthProvider`,
`challengeForAuthorizationCode`) and the MCP HTTP session lifecycle
(`handlePost`/`handleGet`/`handleDelete` 401 with WWW-Authenticate, 404 for
unknown sessions, `resolveUser` precedence chain, `evictIfNeeded` cap).
- Fix flaky 401 in `createHttpUser` (pre-existing): BetterAuth's
`api.signInEmail()` does not always echo a session token into the response
body — in isolated test runs the body is `{ requiresTwoFactor, success, user }`
without `token` or `session`. Session is still established via the
`iam.session_token` Set-Cookie. `createHttpUser` now forwards that cookie to
`betterAuthService.getToken({ headers: { cookie } })` (the JWT plugin path)
to deterministically obtain a Bearer JWT — eliminates flaky 401s across the
HTTP-layer tests in this file.
Docs / housekeeping:
- Add `load-tests/ai-smoke.k6.js` as the AI request-pipeline smoke baseline.
- Persist reviewer agent memories for future review continuity.
Verification: 3 consecutive `pnpm run check` runs green (1915 tests pass,
build OK, server starts cleanly).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
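The client-bound rotation can be illustrated with an in-memory stand-in for the token store. `InMemoryRefreshStore` is hypothetical; its `rotate` method mirrors the `findOneAndDelete({ token, clientId })` semantics described in this commit, and the new token is passed in by the caller only to keep the sketch free of crypto imports.

```typescript
interface RefreshRecord {
  token: string;
  clientId: string;
  userId: string;
}

class InMemoryRefreshStore {
  private records = new Map<string, RefreshRecord>();

  save(record: RefreshRecord): void {
    this.records.set(record.token, record);
  }

  // The old token is consumed only when token AND clientId both match,
  // so a token presented by a different client is rejected and left intact.
  rotate(token: string, clientId: string, newToken: string): RefreshRecord | undefined {
    const existing = this.records.get(token);
    if (!existing || existing.clientId !== clientId) return undefined;
    this.records.delete(token); // single use
    const next: RefreshRecord = { ...existing, token: newToken };
    this.records.set(newToken, next);
    return next;
  }
}
```

In a real store the lookup and delete must be a single atomic operation, which is what `findOneAndDelete` provides; the `Map` version is atomic only because JavaScript runs it on a single thread.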
Add migration notes for the changes that landed in the previous commit:
- Conversation list endpoint behaviour change (`select: '-messages'`) —
clients that read `messages` from the list endpoint must switch to the
per-id detail call. Surfaced both in "Operational notes" and as a
⚠️ Behaviour change row in Compatibility Notes.
- MCP OAuth security fixes — refresh-token rotation is now bound to
`client_id` (impersonation fix) and `getClient()` returns `client_secret`
so confidential clients actually verify it. Subclassers must mirror the new
`rotateRefreshToken(token, clientId)` signature and not strip
`client_secret` from the returned shape.
- Typed `ICoreModuleOverrides.ai` (now exposes all 18 collaborators) and
`IAi.claudeCli` — `as any` casts can be dropped.
Overview Bugfixes row + Compatibility Notes updated to point readers at the
new "Operational notes" entries.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Live test against a fresh fullstack workspace (mittwald
Ministral-3-14B-Instruct-2512) surfaced five bugs of varying severity.
This commit fixes all of them and adds the missing operational docs.
BUG-1 — HIGH: BSONError 500 from `loadRecentMessages` on bad conversationId
- A client that sends `conversationId: "null"` (literal string) crashed the
prompt pipeline with `BSONError: input must be a 24 character hex string`
bubbling up from Mongoose's ObjectId cast.
- `CoreAiConversationService.loadRecentMessages` now validates the id with
`Types.ObjectId.isValid` and fails-soft to `[]` (= "no history available"),
which is the same outcome as omitting the field. The orchestrator's
`loadConversationHistory` also early-returns on the literal strings
`"null"` / `"undefined"` before touching the service.
- Added 5 defensive-input regression tests.
BUG-2 — MEDIUM: 500 trace when @modelcontextprotocol/sdk is not installed
- `ai.mcp.enabled` is set but the SDK is missing → the lazy `import()` in
`CoreAiMcpController.handlePost` blew up with a raw require-stack 500.
- The controller now catches the `MODULE_NOT_FOUND` and returns
**503 Service Unavailable** with an actionable install hint
("Run `pnpm add @modelcontextprotocol/sdk` …"). Added a regression test.
BUG-3 — LOW: `POST /ai/slots/:id/reset` returned `true` instead of a slot
- The reset endpoint discarded the now-effective system default, so the
admin UI needed a second `/ai/slots/effective` call to refresh.
- Service now returns the synthetic `EffectiveSlot` shape
(`isSystem: true, isOverride: false, systemKey: <key>`) so a single call
is enough. Controller signature simplified to forward the slot.
BUG-4 — LOW: `update_user_job_title` reference tool didn't copy cleanly
- Starter projects whose `UserInput` doesn't carry a `jobTitle` field hit a
TS2339 on the literal `input.jobTitle = …` assignment.
- Tool now uses a dynamic `(input as Record<string, unknown>).jobTitle`
assignment + an inline JSDoc note that consumer projects must add the
field (or rename the tool) before using it. Also marked `mutating: true`
(closes SEC-009 from the prior review).
BUG-5 — LOW: `POST /ai/slots` returned `isOverride: null` for overrides
- `create`/`update` returned the raw `CoreAiSlot` shape which has no
`isOverride` / `isSystem` columns. Listing the same slot via
`/ai/slots/effective` correctly showed `isOverride: true` — inconsistent.
- Both mutations now run the result through a new `decorateWithSystemFlags`
helper that computes the flags against the system-default key set, so
create/update/listEffective all return the same shape.
Docs:
- Added "MCP server" install section to INTEGRATION-CHECKLIST.md, README.md
and the migration guide — projects that enable `ai.mcp` must
`pnpm add @modelcontextprotocol/sdk`. SDK is a peer-style optional
dependency that the controller lazy-imports only on the first MCP
request, so projects without MCP pay no install cost.
- Added two troubleshooting rows to the migration guide:
503 "MCP server unavailable" → install SDK; 500 BSONError → upgrade.
Verification: `pnpm run check` green (1921 tests pass — 1915 baseline +
6 new regression tests; build OK; server starts cleanly).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
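The fail-soft handling from BUG-1 can be sketched without Mongoose. The strict 24-character hex check below is a stand-in for `Types.ObjectId.isValid` (which accepts more input forms), and `normalizeConversationId` plus the `Map`-based store are hypothetical.

```typescript
const HEX_24 = /^[0-9a-fA-F]{24}$/;

// Returns a usable id, or undefined when the input cannot be an ObjectId.
function normalizeConversationId(id: unknown): string | undefined {
  if (typeof id !== 'string') return undefined;
  // Some clients serialize missing values as literal strings.
  if (id === 'null' || id === 'undefined') return undefined;
  return HEX_24.test(id) ? id : undefined;
}

// Fails soft to [] ("no history available") instead of throwing on bad ids.
function loadRecentMessages(
  id: unknown,
  store: Map<string, string[]>,
): string[] {
  const valid = normalizeConversationId(id);
  return valid ? (store.get(valid) ?? []) : [];
}
```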
`permissions.md` is generated by `lt server permissions` and was repeatedly
showing up in `git status` after running the scanner during reviews. Add it to
`.gitignore` so the scanner can run freely without polluting the working tree.
The local `ai-test-screenshots/` from the live mittwald test was a transient
artifact and has been deleted.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Bumps `@lenne.tech/nest-server` to 11.26.0 and ships the full AI-assistant module: DB-backed encrypted LLM connections, provider-agnostic orchestrator (auto + plan mode), role-filtered tool registry, multi-turn conversations, SSE streaming, audit log, per-user/tenant token budgets, an MCP server (with optional OAuth 2.1), prioritized connection-resolution chain, tenant-scoped self-optimizing prompts (admin slots + governed learning loop), per-session context-window handling with auto-detection, runtime placeholder registry, and user-facing prompt templates ("Vorlagen") with run-time placeholder substitution.
What's in here
No breaking changes for non-AI users
The module is opt-in. Projects without an `ai` config block see no behavioural change at all. The one breaking section ("AI naming overhaul") applies only to projects that pulled a pre-release AI build before 11.26.0; fresh integrations follow the new naming from day one.
Companion PRs
Test plan
🤖 Generated with Claude Code