diff --git a/docs/architecture.md b/docs/architecture.md
index cfd610280..1830cd22c 100644
--- a/docs/architecture.md
+++ b/docs/architecture.md
@@ -972,8 +972,22 @@ Master Key (env var EDDI_VAULT_MASTER_KEY)
- **Envelope encryption**: Rotating the master key re-wraps KEKβDEK without touching individual secrets
- **Export scrubbing**: Agent export/sync automatically strips secrets from ZIP files
-### Authentication Model
+### Cryptographic Agent Identity
+
+EDDI agents can sign their inter-agent messages using Ed25519 digital signatures. This protects multi-agent group conversations against identity spoofing, message tampering, and provides non-repudiation for audit trails.
+
+**Key lifecycle:**
+1. **Key generation**: `POST /agentstore/{id}/signing/keys` β `AgentSigningService.generateKeyPair()` creates an Ed25519 keypair. Public key stored in `AgentConfiguration.identity.publicKey`, private key encrypted in the Secrets Vault
+2. **Key rotation**: `AgentPublicKey` records support versioned keys with `validFromMs`/`validUntilMs` windows. Old and new keys overlap during rotation. Private keys use versioned vault paths (`agent-signing-key:{agentId}:v{version}`)
+3. **Signing**: When `security.signInterAgentMessages=true`, the `GroupConversationService` creates a `SignedEnvelope` for each agent response. The envelope contains the message payload, a UUID nonce, and an epoch timestamp. The canonical JSON form (RFC 8785 via `JacksonCanonicalizer`) is signed with Ed25519
+4. **Self-verification**: Immediately after signing, the service verifies its own signature against the agent's public key. If self-verification fails, the signature is discarded (fail-safe to unsigned)
+5. **Replay protection**: The `NonceCacheService` registers each nonce with freshness (5min default) and clock-skew (30s default) checks. Duplicate nonces are rejected
+6. **Peer verification**: When `security.requirePeerVerification=true` on a receiving agent, the service reconstructs envelopes from stored `TranscriptEntry` fields and verifies each speaker's signature against their public key before sending context
+
+**What is NOT covered:** MCP invocation signing is not yet implemented β the `signMcpInvocations` config field has been removed until the feature is built.
+
+### Authentication Model
| Environment | OIDC Enabled | Behavior |
|-------------|-------------|----------|
| **Dev mode** | No | Allowed β info log on startup |
diff --git a/docs/changelog.md b/docs/changelog.md
index 168b2e1d2..092e374c9 100644
--- a/docs/changelog.md
+++ b/docs/changelog.md
@@ -4,6 +4,151 @@
---
+## π οΈ PR Feedback Remediation β Production Hardening (2026-05-17)
+
+**Repo:** EDDI (`feature/feature-gap-remediation`)
+**What changed:** Addressed ~25 findings from CodeQL, code quality bot, Copilot, and CodeRabbit reviews. All actionable items resolved.
+
+### Security Fixes
+- **NonceCacheService TOCTOU:** Replaced non-atomic `get()`+`put()` with `putIfAbsent()` for replay detection. The get-then-put pattern allowed two concurrent requests with the same nonce to both pass the replay check.
+- **NonceCacheService null guard:** Added null/blank nonce early rejection.
+- **Log injection (centralized):** Replaced per-file `sanitizeForLog()` methods in `GroupConversationService`, `MongoTenantQuotaStore`, `PostgresTenantQuotaStore` with centralized `LogSanitizer.sanitize()`. Added Unicode line separator (U+2028/U+2029) handling per CodeQL feedback. Also wrapped `e.getMessage()` in log calls.
+- **Fail-closed cost accounting:** `PostgresTenantQuotaStore.tryAddCost()` now returns `DENIED` on SQL failure instead of `OK` β prevents budget bypass when database is unreachable.
+- **Key version validation:** `AgentSigningService.generateKeyPairVersioned()` and `rotateKey()` now reject `version <= 0`.
+- **JacksonCanonicalizer strict duplicate detection:** Enabled `StreamReadFeature.STRICT_DUPLICATE_DETECTION` to prevent collision attacks where different JSON payloads produce identical canonical output. Removed inaccurate RFC 8785 claim from javadoc.
+- **AgentSigningService versioned key cleanup:** `deleteKeyPair()` now deletes both legacy unversioned and all versioned vault secrets. `generateKeyPairVersioned()` now evicts version-specific cache entries.
+
+### Performance Fixes
+- **Incremental peer verification:** `verifyPriorEntriesIfRequired()` now tracks last-verified transcript index per conversation (O(N) amortized instead of O(NΒ²) per-turn re-verification). Public keys cached per speaker to avoid redundant `agentStore` lookups.
+- **signEnvelope private key caching:** Now uses `privateKeyCache.computeIfAbsent()` with versioned cache key, avoiding vault round-trips on every call.
+
+### Architecture Fixes
+- **DiscoverToolsTool CDI exclusion:** Added `@Vetoed` to prevent Quarkus CDI from auto-discovering the class as a bean (it is manually constructed by AgentOrchestrator).
+- **LAZY tool activation:** Fixed gap where discovered tools couldn't actually be called. `collectEnabledTools()` now returns ALL tools (registering executors), while `executeWithTools()` initially presents only `discover_tools` spec. After the LLM calls `discover_tools`, matching built-in specs are activated via `activateDiscoveredTools()`.
+- **PostgresTenantQuotaStore transactional delete:** `deleteQuota()` now wraps both `tenant_quotas` and `tenant_usage` deletes in a single transaction with rollback on failure.
+- **PostgresTenantQuotaStore schema auto-creation:** Added `CREATE TABLE IF NOT EXISTS` with `ensureSchema()` pattern (matching `PostgresGlobalVariableStore`, `PostgresSecretPersistence`, etc.).
+- **MongoTenantQuotaStore unique index:** Added unique ascending index on `tenantId` for both `tenant_quotas` and `tenant_usage` collections to prevent duplicate rows from upsert races.
+- **DiscoverToolsTool JSON serialization:** Replaced manual `StringBuilder` JSON assembly with Jackson `ObjectMapper` for proper escaping of special characters in tool descriptions.
+- **JacksonCanonicalizer overload rename:** `canonicalize(Object)` β `canonicalizeObject(Object)` to eliminate static dispatch ambiguity.
+- **GroupConversationService FQN cleanup:** Replaced 5 fully-qualified class references (`ai.labs.eddi.configs.agents.crypto.*`) with proper imports.
+- **AgentOrchestrator log fix:** Compute external tool count explicitly instead of `activeSpecs.size() - 1` to avoid misleading `-1` in logs.
+
+### Changelog accuracy
+- Fixed Item 1 and Item 2 descriptions below (see corrections inline).
+
+**Files:** `NonceCacheService.java`, `GroupConversationService.java`, `MongoTenantQuotaStore.java`, `PostgresTenantQuotaStore.java`, `AgentSigningService.java`, `AgentOrchestrator.java`, `DiscoverToolsTool.java`, `JacksonCanonicalizer.java`, `SignedEnvelope.java`, `LogSanitizer.java`, `changelog.md`
+
+---
+
+
+## π‘οΈ Crypto Security Review β Fail-Safe Remediations (2026-05-15)
+
+**Repo:** EDDI (`feature/feature-gap-remediation`)
+**What changed:** Security-focused code review identified 7 findings (2 high, 3 medium, 2 low). All remediated. Key principle: signing failures are **fail-safe** β discard the broken signature and fall back to unsigned, rather than storing broken data.
+
+### S1+S2 (HIGH): Signing failures now fail-safe to unsigned
+- Self-verify failure (`verifyEnvelope` returns false) β discard signature, fall back to unsigned entry
+- Nonce validation failure β discard signature, fall back to unsigned entry
+- Previously: logged warning/error but continued with broken signature stored permanently
+
+### S3+S4 (MEDIUM): Null guards for crypto infrastructure
+- Signing block: `agentStore`, `agentSigningService`, `nonceCacheService` all guarded for null
+- `agentConfig.getIdentity()` guarded before `getKeyValidAt()` call
+
+### S7 (LOW): NonceCacheService unused `ttlMs` variable
+- Removed computed `ttlMs` that was never passed to cache factory
+- Added documentation comment explaining the cache TTL configuration requirement
+
+### Tests: 15 new tests (84 total affected)
+- `TranscriptEntry`: full 13-param constructor, `hasEnvelopeData()` (4 edge cases), signature-only constructor
+
+### Docs updated
+- `docs/architecture.md`: added Cryptographic Agent Identity section
+- `planning/manager-ui-handoff.md`: removed `signMcpInvocations`, `forkingEnabled`, `maxForksPerConversation`, updated Security section to show active signing flags
+
+---
+
+## π Cryptographic Agent Identity β End-to-End Hardening (2026-05-15)
+
+**Repo:** EDDI (`feature/feature-gap-remediation`)
+**What changed:** Evolved the partial SignedEnvelope infrastructure into a fully-wired, production-standard cryptographic identity system. Removed dead config fields, added peer verification, and made all security features functional.
+
+### Config Cleanup β Remove Dead Fields
+- **Removed:** `signMcpInvocations` from `SecurityConfig` (no MCP signing implementation exists)
+- **Removed:** `forkingEnabled` + `maxForksPerConversation` from `SessionManagement` (no forking service exists)
+- **Rationale:** "Configs without functionality" creates false confidence. Features are added alongside their implementation, not before.
+- **Files:** `AgentConfiguration.java`, `RestAgentStore.java` (removed `validateSessionFlags()`), tests updated
+
+### TranscriptEntry β Full Envelope Storage
+- **Added:** `signatureNonce`, `signatureTimestampMs`, `signatureKeyVersion` fields to `TranscriptEntry` record
+- **Added:** `hasEnvelopeData()` convenience method for verification checks
+- **Backward-compatible:** Two compact constructors for unsigned and signature-only entries
+- **Files:** `GroupConversation.java`
+
+### GroupConversationService β End-to-End Crypto Wiring
+- **Injected:** `NonceCacheService` for replay protection
+- **Signing block:** Now creates full `SignedEnvelope` with nonce, immediately self-verifies, registers nonce, and stores all envelope fields in `TranscriptEntry`
+- **Added:** `verifyPriorEntriesIfRequired()` β when receiving agent has `requirePeerVerification=true`, reconstructs envelopes from stored fields and verifies each speaker's signature against their public key
+- **Defense-in-depth:** Signing self-verifies at creation time; peer verification at consumption time catches key rotation issues or data corruption
+- **Files:** `GroupConversationService.java`
+
+### LlmConfiguration β Configurable maxToolsInContext
+- **Added:** `maxToolsInContext` field (default: 20) to `LlmConfiguration.Task` for LAZY tool loading
+- **Previously:** Hardcoded `int maxToolsInContext = 20` in `AgentOrchestrator`
+- **Files:** `LlmConfiguration.java`, `AgentOrchestrator.java`
+
+### MongoTenantQuotaStore β TOCTOU Documentation
+- **Added:** Comment documenting the minor TOCTOU race at window boundaries in multi-instance deployments
+- **Files:** `MongoTenantQuotaStore.java`
+
+### Test Fixes
+- Updated `SessionManagementTest`, `AgentConfigurationTest`, `RestAgentStoreTest` β removed references to deleted fields
+- Updated `GroupConversationServiceTest` β added `NonceCacheService` constructor parameter
+- All 69 affected tests pass (0 failures, 0 errors)
+
+---
+
+## π§ Feature Gap Remediation β 6 Items Resolved (2026-05-15)
+
+**Repo:** EDDI (`feature/feature-gap-remediation`)
+**What changed:** Systematic audit found 8 gaps between documented features and actual implementation. Fixed 6 items (2 required no changes).
+
+### Item 1: Session Forking β Config Removed
+- **Problem:** `forkingEnabled=true` accepted silently but no `ConversationForkService` exists
+- **Original fix:** Added `validateSessionFlags()` in `RestAgentStore` to reject the flag with a clear error
+- **Final state:** Both `forkingEnabled` and `maxForksPerConversation` config fields were fully removed (config-without-functionality anti-pattern). `validateSessionFlags()` was also removed since there are no session flags left to validate.
+- **Files:** `AgentConfiguration.java`, `RestAgentStore.java`
+
+### Item 2: Signing Flags β Config Removed
+- **Problem:** `signMcpInvocations` flag accepted silently but no MCP signing implementation exists
+- **Original fix:** Split `validateSecurityFlags()` to reject `signMcpInvocations` while allowing `signInterAgentMessages` and `requirePeerVerification`
+- **Final state:** `signMcpInvocations` field was fully removed from `SecurityConfig`. The validation method was also removed since both remaining flags (`signInterAgentMessages`, `requirePeerVerification`) now have runtime implementations.
+- **Files:** `AgentConfiguration.java`, `RestAgentStore.java`
+
+### Item 3: DiscoverToolsTool β Recovered + Wired
+- **Problem:** Token-saving lazy tool loading deleted as dead code (commit `05edf602`)
+- **Fix:** Recovered `DiscoverToolsTool.java` + test, added `ToolLoadingStrategy` enum (EAGER/LAZY) to `LlmConfiguration.Task`, wired LAZY branch into `AgentOrchestrator.collectEnabledTools()` β when LAZY, only `discover_tools` meta-tool is sent initially, LLM discovers available tools, specs injected mid-loop
+- **Files:** `DiscoverToolsTool.java` (recovered), `LlmConfiguration.java`, `AgentOrchestrator.java`
+
+### Item 4: Cryptographic Infrastructure β Recovered + Wired
+- **Problem:** `SignedEnvelope`, `JacksonCanonicalizer`, `NonceCacheService` deleted as dead code (commit `4a717fa5`)
+- **Fix:** Recovered all 3 files + tests, re-added `signEnvelope()`/`verifyEnvelope()`/`rotateKey()`/`generateKeyPairVersioned()` to `AgentSigningService`, upgraded `GroupConversationService` signing from simple string signing to full `SignedEnvelope` with nonce-based replay protection
+- **Files:** `SignedEnvelope.java`, `JacksonCanonicalizer.java`, `NonceCacheService.java` (all recovered), `AgentSigningService.java`, `GroupConversationService.java`
+
+### Item 5: Tenant Quota DB Persistence β Dual-Backend Stores
+- **Problem:** `ITenantQuotaStore` only had `InMemoryTenantQuotaStore` β restarts reset all quota counters, no cross-instance synchronization
+- **Fix:** Created `MongoTenantQuotaStore` (uses `findAndModify` for atomicity) and `PostgresTenantQuotaStore` (uses `UPDATE...WHERE...RETURNING`), wired into `DataStoreProducers` following existing dual-backend pattern
+- **Files:** `MongoTenantQuotaStore.java` (new), `PostgresTenantQuotaStore.java` (new), `DataStoreProducers.java`
+
+### Item 6: NATS Documentation
+- NATS code works correctly for what it does (durable ordered processing with retry/dead-letter)
+- No code changes needed β documentation accuracy to be addressed separately
+
+### Items 7-8: No Changes Needed
+- HIPAA docs accurately describe documentation, not code enforcement
+- OpenTelemetry opt-in is standard industry practice
+
+
## How to Read This Document
Each entry follows this format:
diff --git a/planning/manager-ui-handoff.md b/planning/manager-ui-handoff.md
index bbb4adda0..a4cb2db07 100644
--- a/planning/manager-ui-handoff.md
+++ b/planning/manager-ui-handoff.md
@@ -179,7 +179,6 @@ These fields live on the **agent** object itself.
{
"security": {
"signInterAgentMessages": false,
- "signMcpInvocations": false,
"requirePeerVerification": false
}
}
@@ -187,12 +186,11 @@ These fields live on the **agent** object itself.
| Field | Type | Default | UI Widget |
|-------|------|---------|-----------|
-| `signInterAgentMessages` | `boolean` | `false` | Toggle (disabled) |
-| `signMcpInvocations` | `boolean` | `false` | Toggle (disabled) |
-| `requirePeerVerification` | `boolean` | `false` | Toggle (disabled) |
+| `signInterAgentMessages` | `boolean` | `false` | Toggle |
+| `requirePeerVerification` | `boolean` | `false` | Toggle |
-> [!WARNING]
-> These flags are **validated but not yet wired**. The backend rejects `true` with HTTP 400 until the full signing pipeline is activated. The Manager should render them as **disabled toggles** with a tooltip: *"Available in a future release"*.
+> [!IMPORTANT]
+> Both flags are **fully wired** and operational as of v6.0.2. Enabling either requires a valid Ed25519 keypair on the agent's identity block (the backend validates on save). The Manager should show a validation error if the toggle is enabled but no keypair exists.
---
@@ -236,9 +234,7 @@ These fields live on the **agent** object itself.
"enabled": false,
"triggerOn": ["before_tool"]
},
- "forkingEnabled": false,
- "maxCheckpointsPerConversation": 10,
- "maxForksPerConversation": 5
+ "maxCheckpointsPerConversation": 10
}
}
```
@@ -247,12 +243,10 @@ These fields live on the **agent** object itself.
|-------|------|---------|--------|-----------|
| `autoSnapshot.enabled` | `boolean` | `false` | β | Toggle switch |
| `autoSnapshot.triggerOn` | `string[]` | `[]` | `before_tool`, `before_action` | Multi-select checkboxes |
-| `forkingEnabled` | `boolean` | `false` | β | Toggle (disabled β future feature) |
| `maxCheckpointsPerConversation` | `int` | `10` | 1β100 | Number input |
-| `maxForksPerConversation` | `int` | `5` | 1β50 | Number input (disabled β future feature) |
**UX Notes:**
-- `forkingEnabled` and `maxForksPerConversation` should be rendered as **disabled** with tooltip *"Available in a future release"*
+- Session forking (`forkingEnabled`, `maxForksPerConversation`) has been **removed** from the config β it will be re-added when the forking service is implemented
- When `autoSnapshot.enabled` is `false`, collapse sub-fields
---
@@ -369,11 +363,11 @@ Two new condition types are available in the behavior rule editor.
β Identity: did:eddi:agent-1 [Generate Keypair] β
β Public Key: MCowBQYDK2Vw... (read-only) β
β β
-β β Signing Flags (not yet wired) βββββββββββββββββββββ
-β β β Sign inter-agent messages (disabled) ββ
-β β β Sign MCP invocations (disabled) ββ
-β β β Require peer verification (disabled) ββ
-β βββββββββββββββββββββββββββββββββββββββββββββββββββββ
+β β Signing Flags ββββββββββββββββββββββββββββββββββ β
+β β β Sign inter-agent messages [ON/OFF] β β
+β β β Require peer verification [ON/OFF] β β
+β β β Requires keypair to enable β β
+β ββββββββββββββββββββββββββββββββββββββββββββββββββ β
β β
β ββ Memory Tab βββββββββββββββββββββββββββββββββββββββ
β Strict Write Discipline: [OFF] β
@@ -383,7 +377,7 @@ Two new condition types are available in the behavior rule editor.
β Auto Snapshot: [OFF] β
β Trigger On: β before_tool β before_action β
β Max Checkpoints: [10] β
-β Forking: (disabled β future release) β
+β β
βββββββββββββββββββββββββββββββββββββββββββββββββββββββ
```
@@ -442,9 +436,8 @@ interface AgentConfiguration {
};
security?: {
- signInterAgentMessages: boolean; // default: false, NOT YET WIRED
- signMcpInvocations: boolean; // default: false, NOT YET WIRED
- requirePeerVerification: boolean; // default: false, NOT YET WIRED
+ signInterAgentMessages: boolean; // default: false, requires keypair
+ requirePeerVerification: boolean; // default: false, requires keypair
};
memoryPolicy?: {
@@ -459,9 +452,7 @@ interface AgentConfiguration {
enabled: boolean; // default: false
triggerOn: string[]; // 'before_tool', 'before_action'
};
- forkingEnabled: boolean; // default: false, NOT YET WIRED
maxCheckpointsPerConversation: number; // default: 10
- maxForksPerConversation: number; // default: 5, NOT YET WIRED
};
}
```
@@ -508,4 +499,4 @@ interface CapabilityMatchConfigs {
| π’ P2 | Session Management | Small | Toggle + checkboxes, but forking is deferred |
| π’ P2 | Behavior Conditions | Medium | Extends existing condition editor with 2 new types |
| βͺ P3 | Cryptographic Identity | Small | Mostly read-only display + one button |
-| βͺ P3 | Security Flags | Trivial | 3 disabled toggles with "coming soon" tooltip |
+| βͺ P3 | Security Flags | Small | 2 toggles (active), require keypair validation |
diff --git a/src/main/java/ai/labs/eddi/configs/agents/AgentSigningService.java b/src/main/java/ai/labs/eddi/configs/agents/AgentSigningService.java
index 40ecf08b6..8f80bb67d 100644
--- a/src/main/java/ai/labs/eddi/configs/agents/AgentSigningService.java
+++ b/src/main/java/ai/labs/eddi/configs/agents/AgentSigningService.java
@@ -50,6 +50,8 @@ public class AgentSigningService {
private static final Logger LOGGER = Logger.getLogger(AgentSigningService.class);
private static final String ALGORITHM = "Ed25519";
private static final String VAULT_KEY_PREFIX = "agent-signing-key:";
+ /** Maximum key version to scan during deletion cleanup. */
+ private static final int MAX_KEY_VERSION_SCAN = 100;
private final ISecretProvider secretProvider;
private final MeterRegistry meterRegistry;
@@ -196,14 +198,28 @@ public boolean verify(String publicKeyB64, String payload, String signatureB64)
}
/**
- * Delete the signing keypair for an agent (cleanup on agent deletion).
+ * Delete the signing keypair for an agent (cleanup on agent deletion). Removes
+ * both the legacy unversioned key and any versioned keys found.
*/
public void deleteKeyPair(String tenantId, String agentId) {
try {
+ // Delete legacy unversioned key
SecretReference ref = new SecretReference(tenantId, vaultKeyName(agentId));
secretProvider.delete(ref);
privateKeyCache.remove(cacheKey(tenantId, agentId));
- LOGGER.infof("Deleted signing key for agent '%s' in tenant '%s' (cache evicted)", agentId, tenantId);
+
+ // Delete versioned keys (scan reasonable range)
+ for (int v = 1; v <= MAX_KEY_VERSION_SCAN; v++) {
+ try {
+ SecretReference vRef = new SecretReference(tenantId, vaultKeyNameVersioned(agentId, v));
+ secretProvider.delete(vRef);
+ privateKeyCache.remove(cacheKey(tenantId, agentId) + ";v=" + v);
+ } catch (Exception ignored) {
+ // Version doesn't exist β stop scanning
+ break;
+ }
+ }
+ LOGGER.infof("Deleted signing keys for agent '%s' in tenant '%s' (cache evicted)", agentId, tenantId);
} catch (Exception e) {
LOGGER.warnf("Failed to delete signing key for agent '%s': %s", agentId, e.getMessage());
}
@@ -213,6 +229,10 @@ private String vaultKeyName(String agentId) {
return VAULT_KEY_PREFIX + agentId;
}
+ private String vaultKeyNameVersioned(String agentId, int version) {
+ return VAULT_KEY_PREFIX + agentId + ":v" + version;
+ }
+
/**
* Collision-resistant cache key: uses a structured format so that
* tenantId="a:b", agentId="c" cannot collide with tenantId="a", agentId="b:c".
@@ -221,6 +241,154 @@ private static String cacheKey(String tenantId, String agentId) {
return "tenant=" + tenantId + ";agent=" + agentId;
}
+ /**
+ * Generate a versioned keypair for key rotation.
+ *
+ * @param tenantId
+ * the tenant identifier
+ * @param agentId
+ * the agent identifier
+ * @param version
+ * the key version number
+ * @return the Base64-encoded public key
+ * @throws AgentSigningException
+ * if key generation fails
+ */
+ public String generateKeyPairVersioned(String tenantId, String agentId, int version) throws AgentSigningException {
+ if (version <= 0) {
+ throw new AgentSigningException("Key version must be positive, got: " + version, null);
+ }
+ try {
+ KeyPairGenerator keyGen = KeyPairGenerator.getInstance(ALGORITHM);
+ KeyPair keyPair = keyGen.generateKeyPair();
+
+ String publicKeyB64 = Base64.getEncoder().encodeToString(keyPair.getPublic().getEncoded());
+ String privateKeyB64 = Base64.getEncoder().encodeToString(keyPair.getPrivate().getEncoded());
+
+ // Store versioned private key in vault
+ SecretReference ref = new SecretReference(tenantId, vaultKeyNameVersioned(agentId, version));
+ secretProvider.store(ref, privateKeyB64,
+ "Ed25519 signing key v" + version + " for agent " + agentId,
+ List.of(agentId));
+
+ // Evict version-specific cached private key so the new key is used immediately
+ privateKeyCache.remove(cacheKey(tenantId, agentId) + ";v=" + version);
+ // Also evict the legacy unversioned entry (if any)
+ privateKeyCache.remove(cacheKey(tenantId, agentId));
+
+ LOGGER.infof("Generated Ed25519 keypair v%d for agent '%s' in tenant '%s'", version, agentId, tenantId);
+ return publicKeyB64;
+ } catch (NoSuchAlgorithmException e) {
+ throw new AgentSigningException("Ed25519 not available in JVM", e);
+ } catch (ISecretProvider.SecretProviderException e) {
+ throw new AgentSigningException("Failed to store private key in vault", e);
+ }
+ }
+
+ /**
+ * Sign a {@link ai.labs.eddi.configs.agents.crypto.SignedEnvelope} using the
+ * agent's versioned key.
+ *
+ * @param tenantId
+ * the tenant identifier
+ * @param agentId
+ * the agent identifier
+ * @param envelope
+ * the unsigned envelope
+ * @param keyVersion
+ * the key version to use for signing
+ * @return the signed envelope
+ * @throws AgentSigningException
+ * if signing fails
+ */
+ public ai.labs.eddi.configs.agents.crypto.SignedEnvelope signEnvelope(
+ String tenantId, String agentId,
+ ai.labs.eddi.configs.agents.crypto.SignedEnvelope envelope,
+ int keyVersion)
+ throws AgentSigningException {
+ try {
+ String canonicalForm = envelope.canonicalForm();
+ String vaultKey = keyVersion > 0
+ ? vaultKeyNameVersioned(agentId, keyVersion)
+ : vaultKeyName(agentId);
+
+ // Use versioned cache key so different key versions don't collide
+ String cacheKeyStr = keyVersion > 0
+ ? cacheKey(tenantId, agentId) + ";v=" + keyVersion
+ : cacheKey(tenantId, agentId);
+
+ PrivateKey privateKey = privateKeyCache.computeIfAbsent(cacheKeyStr, k -> {
+ try {
+ SecretReference ref = new SecretReference(tenantId, vaultKey);
+ String privateKeyB64 = secretProvider.resolve(ref);
+ byte[] privateKeyBytes = Base64.getDecoder().decode(privateKeyB64);
+ KeyFactory keyFactory = KeyFactory.getInstance(ALGORITHM);
+ return keyFactory.generatePrivate(
+ new java.security.spec.PKCS8EncodedKeySpec(privateKeyBytes));
+ } catch (Exception e) {
+ throw new PrivateKeyLoadException(agentId, e);
+ }
+ });
+
+ Signature sig = Signature.getInstance(ALGORITHM);
+ sig.initSign(privateKey);
+ sig.update(canonicalForm.getBytes(StandardCharsets.UTF_8));
+ String signatureB64 = Base64.getEncoder().encodeToString(sig.sign());
+
+ signCounter.increment();
+ return envelope.withSignature(signatureB64, keyVersion);
+ } catch (PrivateKeyLoadException e) {
+ Throwable cause = e.getCause();
+ throw new AgentSigningException("Envelope signing failed for agent " + agentId
+ + ": " + cause.getClass().getSimpleName(), cause);
+ } catch (Exception e) {
+ throw new AgentSigningException("Envelope signing failed for agent " + agentId, e);
+ }
+ }
+
+ /**
+ * Verify a signed envelope against a public key.
+ *
+ * @param envelope
+ * the signed envelope to verify
+ * @param publicKeyB64
+ * the Base64-encoded public key
+ * @return true if the signature is valid
+ */
+ public boolean verifyEnvelope(ai.labs.eddi.configs.agents.crypto.SignedEnvelope envelope, String publicKeyB64) {
+ try {
+ String canonicalForm = envelope.canonicalForm();
+ return verify(publicKeyB64, canonicalForm, envelope.signature());
+ } catch (Exception e) {
+ LOGGER.warnf("Envelope verification failed: %s", e.getMessage());
+ verifyFailCounter.increment();
+ return false;
+ }
+ }
+
+ /**
+ * Rotate the signing key for an agent. Creates a new versioned key and returns
+ * the public key for it.
+ *
+ * @param tenantId
+ * the tenant identifier
+ * @param agentId
+ * the agent identifier
+ * @param newVersion
+ * the new key version number
+ * @return the Base64-encoded new public key
+ * @throws AgentSigningException
+ * if rotation fails
+ */
+ public String rotateKey(String tenantId, String agentId, int newVersion) throws AgentSigningException {
+ if (newVersion <= 0) {
+ throw new AgentSigningException("Key version must be positive, got: " + newVersion, null);
+ }
+ String publicKeyB64 = generateKeyPairVersioned(tenantId, agentId, newVersion);
+ LOGGER.infof("Rotated signing key for agent '%s' to version %d", agentId, newVersion);
+ return publicKeyB64;
+ }
+
public static class AgentSigningException extends Exception {
public AgentSigningException(String message, Throwable cause) {
super(message, cause);
diff --git a/src/main/java/ai/labs/eddi/configs/agents/crypto/JacksonCanonicalizer.java b/src/main/java/ai/labs/eddi/configs/agents/crypto/JacksonCanonicalizer.java
new file mode 100644
index 000000000..377fffc72
--- /dev/null
+++ b/src/main/java/ai/labs/eddi/configs/agents/crypto/JacksonCanonicalizer.java
@@ -0,0 +1,103 @@
+/*
+ * Copyright EDDI contributors
+ * SPDX-License-Identifier: Apache-2.0
+ */
+package ai.labs.eddi.configs.agents.crypto;
+
+import com.fasterxml.jackson.core.JsonFactory;
+import com.fasterxml.jackson.core.JsonProcessingException;
+import com.fasterxml.jackson.core.StreamReadFeature;
+import com.fasterxml.jackson.databind.JsonNode;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.databind.SerializationFeature;
+import com.fasterxml.jackson.databind.node.ArrayNode;
+import com.fasterxml.jackson.databind.node.ObjectNode;
+
+import java.util.Iterator;
+import java.util.TreeMap;
+
+/**
+ * Deterministic JSON canonicalization for cryptographic signing.
+ *
+ * Produces a deterministic JSON string by:
+ *
+ *
Sorting all object keys lexicographically (recursive)
+ *
Removing insignificant whitespace
+ *
+ *
+ * Note: This is NOT a full RFC 8785 (JCS) implementation β
+ * Jackson's default numeric serialization is used. Since EDDI's signed payloads
+ * contain only string fields, this is sufficient for inter-agent signature
+ * verification. If numeric canonicalization becomes necessary, a dedicated JCS
+ * library should be adopted.
+ *
+ * Strict duplicate key detection is enabled to prevent collision attacks where
+ * different JSON payloads produce identical canonical output.
+ *
+ * @since 6.0.0
+ */
+public final class JacksonCanonicalizer {
+
+ private static final ObjectMapper MAPPER = new ObjectMapper(
+ JsonFactory.builder()
+ .enable(StreamReadFeature.STRICT_DUPLICATE_DETECTION)
+ .build())
+ .configure(SerializationFeature.ORDER_MAP_ENTRIES_BY_KEYS, true);
+
+ private JacksonCanonicalizer() {
+ // utility class
+ }
+
+ /**
+ * Canonicalize a JSON string per RFC 8785.
+ *
+ * @param json
+ * the input JSON string
+ * @return canonicalized JSON string with sorted keys and no whitespace
+ * @throws JsonProcessingException
+ * if the input is not valid JSON
+ */
+ public static String canonicalize(String json) throws JsonProcessingException {
+ JsonNode node = MAPPER.readTree(json);
+ JsonNode sorted = sortKeys(node);
+ return MAPPER.writeValueAsString(sorted);
+ }
+
+ /**
+ * Canonicalize a Java object by serializing it to JSON first.
+ *
+ * @param obj
+ * the object to canonicalize
+ * @return canonicalized JSON string
+ * @throws JsonProcessingException
+ * if serialization fails
+ */
+ public static String canonicalizeObject(Object obj) throws JsonProcessingException {
+ JsonNode node = MAPPER.valueToTree(obj);
+ JsonNode sorted = sortKeys(node);
+ return MAPPER.writeValueAsString(sorted);
+ }
+
+ private static JsonNode sortKeys(JsonNode node) {
+ if (node.isObject()) {
+ ObjectNode objectNode = (ObjectNode) node;
+ TreeMap sortedMap = new TreeMap<>();
+ Iterator fieldNames = objectNode.fieldNames();
+ while (fieldNames.hasNext()) {
+ String field = fieldNames.next();
+ sortedMap.put(field, sortKeys(objectNode.get(field)));
+ }
+ ObjectNode sortedNode = MAPPER.createObjectNode();
+ sortedMap.forEach(sortedNode::set);
+ return sortedNode;
+ } else if (node.isArray()) {
+ ArrayNode arrayNode = (ArrayNode) node;
+ ArrayNode sortedArray = MAPPER.createArrayNode();
+ for (JsonNode element : arrayNode) {
+ sortedArray.add(sortKeys(element));
+ }
+ return sortedArray;
+ }
+ return node;
+ }
+}
diff --git a/src/main/java/ai/labs/eddi/configs/agents/crypto/NonceCacheService.java b/src/main/java/ai/labs/eddi/configs/agents/crypto/NonceCacheService.java
new file mode 100644
index 000000000..9bca884d5
--- /dev/null
+++ b/src/main/java/ai/labs/eddi/configs/agents/crypto/NonceCacheService.java
@@ -0,0 +1,132 @@
+/*
+ * Copyright EDDI contributors
+ * SPDX-License-Identifier: Apache-2.0
+ */
+package ai.labs.eddi.configs.agents.crypto;
+
+import ai.labs.eddi.engine.caching.ICacheFactory;
+import ai.labs.eddi.engine.caching.ICache;
+import io.micrometer.core.instrument.Counter;
+import io.micrometer.core.instrument.MeterRegistry;
+import jakarta.annotation.PostConstruct;
+import jakarta.enterprise.context.ApplicationScoped;
+import jakarta.inject.Inject;
+import org.eclipse.microprofile.config.inject.ConfigProperty;
+import org.jboss.logging.Logger;
+
+import java.time.Instant;
+
+/**
+ * Nonce-based replay protection for signed envelopes.
+ *
+ * Three-stage validation:
+ *
+ *
Freshness: Reject if {@code timestampMs} is older than
+ * {@code maxAgeMs} (default 5 minutes)
+ *
Clock skew: Reject if {@code timestampMs} is more than
+ * {@code clockSkewMs} into the future (default 30 seconds)
+ *
Duplicate: Reject if the nonce was already seen within
+ * the TTL window
+ *
+ *
+ * @since 6.0.0
+ */
+@ApplicationScoped
+public class NonceCacheService {
+
+ private static final Logger LOGGER = Logger.getLogger(NonceCacheService.class);
+ private static final String CACHE_NAME = "nonce-replay-protection";
+
+ @ConfigProperty(name = "eddi.a2a.signing.nonce.max-age-ms", defaultValue = "300000") // 5 min
+ long maxAgeMs;
+
+ @ConfigProperty(name = "eddi.a2a.signing.nonce.clock-skew-ms", defaultValue = "30000") // 30 sec
+ long clockSkewMs;
+
+ private final ICacheFactory cacheFactory;
+ private final MeterRegistry meterRegistry;
+ private ICache nonceCache;
+ private Counter replayRejections;
+ private Counter freshnessRejections;
+ private Counter clockSkewRejections;
+
+ @Inject
+ public NonceCacheService(ICacheFactory cacheFactory, MeterRegistry meterRegistry) {
+ this.cacheFactory = cacheFactory;
+ this.meterRegistry = meterRegistry;
+ }
+
+ @PostConstruct
+ void init() {
+ // The cache TTL must be >= maxAge + clockSkew to cover the full replay window.
+ // Minimum required TTL: maxAge + clockSkew + buffer = ~340s with defaults.
+ // This is configured externally via the ICacheFactory cache configuration
+ // (e.g., Caffeine expireAfterWrite in application.properties).
+ this.nonceCache = cacheFactory.getCache(CACHE_NAME);
+
+ replayRejections = meterRegistry.counter("eddi.agent.nonce.replay.rejected");
+ freshnessRejections = meterRegistry.counter("eddi.agent.nonce.freshness.rejected");
+ clockSkewRejections = meterRegistry.counter("eddi.agent.nonce.clockskew.rejected");
+ }
+
+ /**
+ * Validate a nonce + timestamp combination.
+ *
+ * @param nonce
+ * the unique nonce from the envelope
+ * @param timestampMs
+ * the envelope creation timestamp in epoch milliseconds
+ * @return validation result
+ */
+ public NonceValidation validate(String nonce, long timestampMs) {
+ // Reject null/blank nonces immediately
+ if (nonce == null || nonce.isBlank()) {
+ LOGGER.warn("Nonce validation failed: nonce is null or blank");
+ return NonceValidation.REPLAY; // Treat as invalid β same effect as replay
+ }
+
+ long now = Instant.now().toEpochMilli();
+
+ // 1. Freshness check
+ if ((now - timestampMs) > maxAgeMs) {
+ freshnessRejections.increment();
+ LOGGER.debugf("Nonce '%s' rejected: too old (%d ms age, max %d ms)", nonce, now - timestampMs, maxAgeMs);
+ return NonceValidation.TOO_OLD;
+ }
+
+ // 2. Clock skew check
+ if ((timestampMs - now) > clockSkewMs) {
+ clockSkewRejections.increment();
+ LOGGER.debugf("Nonce '%s' rejected: future timestamp (%d ms ahead, max skew %d ms)",
+ nonce, timestampMs - now, clockSkewMs);
+ return NonceValidation.CLOCK_SKEW;
+ }
+
+ // 3. Atomic replay check β putIfAbsent returns null on successful insertion,
+ // existing value if already present. This eliminates the TOCTOU race between
+ // get() and put() that could allow two concurrent requests with the same nonce
+ // to both pass the replay check.
+ Boolean existing = nonceCache.putIfAbsent(nonce, Boolean.TRUE);
+ if (existing != null) {
+ replayRejections.increment();
+ LOGGER.debugf("Nonce '%s' rejected: replay detected", nonce);
+ return NonceValidation.REPLAY;
+ }
+
+ return NonceValidation.VALID;
+ }
+
+ /**
+ * Nonce validation results.
+ */
+ public enum NonceValidation {
+ /** Nonce is valid and has been recorded */
+ VALID,
+ /** Timestamp is too old (exceeds maxAge) */
+ TOO_OLD,
+ /** Timestamp is too far in the future (clock skew) */
+ CLOCK_SKEW,
+ /** Nonce was already used (replay attempt) */
+ REPLAY
+ }
+}
diff --git a/src/main/java/ai/labs/eddi/configs/agents/crypto/SignedEnvelope.java b/src/main/java/ai/labs/eddi/configs/agents/crypto/SignedEnvelope.java
new file mode 100644
index 000000000..4ce9063cf
--- /dev/null
+++ b/src/main/java/ai/labs/eddi/configs/agents/crypto/SignedEnvelope.java
@@ -0,0 +1,103 @@
+/*
+ * Copyright EDDI contributors
+ * SPDX-License-Identifier: Apache-2.0
+ */
+package ai.labs.eddi.configs.agents.crypto;
+
+import com.fasterxml.jackson.annotation.JsonInclude;
+import com.fasterxml.jackson.core.JsonProcessingException;
+
+import java.time.Instant;
+import java.util.Map;
+import java.util.UUID;
+
+/**
+ * Immutable signed envelope for inter-agent communication.
+ *
+ * Lifecycle:
+ *
+ *
{@link #forSigning(String, String, Map)} creates an unsigned envelope
+ * with a fresh nonce and timestamp
+ *
Compute canonical form via {@link JacksonCanonicalizer} for signing
+ *
{@link #withSignature(String, int)} attaches the signature and key
+ * version
+ *
+ *
+ * @param senderId
+ * the agent ID of the sender
+ * @param recipientId
+ * the agent ID of the intended recipient
+ * @param payload
+ * the message payload (arbitrary key-value pairs)
+ * @param nonce
+ * unique nonce for replay protection
+ * @param timestampMs
+ * epoch milliseconds when the envelope was created
+ * @param signature
+ * Base64-encoded Ed25519 signature (null before signing)
+ * @param keyVersion
+ * the version of the key used for signing (0 before signing)
+ *
+ * @since 6.0.0
+ */
+@JsonInclude(JsonInclude.Include.NON_NULL)
+public record SignedEnvelope(
+ String senderId,
+ String recipientId,
+ Map payload,
+ String nonce,
+ long timestampMs,
+ String signature,
+ int keyVersion) {
+
+ /**
+ * Create an unsigned envelope ready for signing.
+ *
+ * @param senderId
+ * the sender agent ID
+ * @param recipientId
+ * the recipient agent ID
+ * @param payload
+ * the message payload
+ * @return an unsigned envelope with a fresh nonce and current timestamp
+ */
+ public static SignedEnvelope forSigning(String senderId, String recipientId, Map payload) {
+ return new SignedEnvelope(
+ senderId,
+ recipientId,
+ payload,
+ UUID.randomUUID().toString(),
+ Instant.now().toEpochMilli(),
+ null, // no signature yet
+ 0);
+ }
+
+ /**
+ * Attach a signature to this envelope.
+ *
+ * @param signature
+ * Base64-encoded Ed25519 signature
+ * @param keyVersion
+ * the version of the key used
+ * @return a new envelope with the signature attached
+ */
+ public SignedEnvelope withSignature(String signature, int keyVersion) {
+ return new SignedEnvelope(senderId, recipientId, payload, nonce, timestampMs, signature, keyVersion);
+ }
+
+ /**
+ * Get the canonical form of this envelope for signing/verification.
+ *
+ * The canonical form includes all fields except {@code signature} and
+ * {@code keyVersion} to prevent circular dependency.
+ *
+ * @return canonical JSON string
+ * @throws JsonProcessingException
+ * if canonicalization fails
+ */
+ public String canonicalForm() throws JsonProcessingException {
+ // Create a copy without signature fields for canonical form
+ var forCanon = new SignedEnvelope(senderId, recipientId, payload, nonce, timestampMs, null, 0);
+ return JacksonCanonicalizer.canonicalizeObject(forCanon);
+ }
+}
diff --git a/src/main/java/ai/labs/eddi/configs/agents/model/AgentConfiguration.java b/src/main/java/ai/labs/eddi/configs/agents/model/AgentConfiguration.java
index 7b4ca6a88..97a7adfcf 100644
--- a/src/main/java/ai/labs/eddi/configs/agents/model/AgentConfiguration.java
+++ b/src/main/java/ai/labs/eddi/configs/agents/model/AgentConfiguration.java
@@ -326,7 +326,6 @@ public String getKeyValidAt(long epochMs) {
*/
public static class SecurityConfig {
private boolean signInterAgentMessages = false;
- private boolean signMcpInvocations = false;
private boolean requirePeerVerification = false;
public boolean isSignInterAgentMessages() {
@@ -337,14 +336,6 @@ public void setSignInterAgentMessages(boolean signInterAgentMessages) {
this.signInterAgentMessages = signInterAgentMessages;
}
- public boolean isSignMcpInvocations() {
- return signMcpInvocations;
- }
-
- public void setSignMcpInvocations(boolean signMcpInvocations) {
- this.signMcpInvocations = signMcpInvocations;
- }
-
public boolean isRequirePeerVerification() {
return requirePeerVerification;
}
@@ -679,15 +670,16 @@ public void setOnFailure(String onFailure) {
}
/**
- * Session management configuration. Controls automatic checkpointing and
- * conversation forking.
+ * Session management configuration. Controls automatic checkpointing.
+ *
+ * Note: Conversation forking (session branching) is planned
+ * for a future release. When implemented, forking config fields will be added
+ * here alongside the implementation.
*
* @since 6.0.0
*/
public static class SessionManagement {
private AutoSnapshot autoSnapshot;
- private boolean forkingEnabled = false;
- private int maxForksPerConversation = 5;
private int maxCheckpointsPerConversation = 10;
public AutoSnapshot getAutoSnapshot() {
@@ -698,22 +690,6 @@ public void setAutoSnapshot(AutoSnapshot autoSnapshot) {
this.autoSnapshot = autoSnapshot;
}
- public boolean isForkingEnabled() {
- return forkingEnabled;
- }
-
- public void setForkingEnabled(boolean forkingEnabled) {
- this.forkingEnabled = forkingEnabled;
- }
-
- public int getMaxForksPerConversation() {
- return maxForksPerConversation;
- }
-
- public void setMaxForksPerConversation(int maxForksPerConversation) {
- this.maxForksPerConversation = maxForksPerConversation;
- }
-
public int getMaxCheckpointsPerConversation() {
return maxCheckpointsPerConversation;
}
diff --git a/src/main/java/ai/labs/eddi/configs/agents/rest/RestAgentStore.java b/src/main/java/ai/labs/eddi/configs/agents/rest/RestAgentStore.java
index 92758941e..d92e3c04e 100644
--- a/src/main/java/ai/labs/eddi/configs/agents/rest/RestAgentStore.java
+++ b/src/main/java/ai/labs/eddi/configs/agents/rest/RestAgentStore.java
@@ -291,20 +291,22 @@ public IResourceId getCurrentResourceId(String id) throws IResourceStore.Resourc
}
/**
- * Validate that an agent's cryptographic security flags are backed by actual
- * signing infrastructure. If any signing flag is enabled, the agent must have a
- * signing key registered via {@code AgentSigningService.generateKeyPair()}.
+ * Validate that cryptographic security flags are backed by a signing keypair.
+ *
+ * Both {@code signInterAgentMessages} and {@code requirePeerVerification}
+ * require an Ed25519 keypair on the agent's identity block. This validation
+ * prevents enabling signing without the necessary infrastructure.
*
* @throws jakarta.ws.rs.BadRequestException
- * if signing is enabled but no key exists
+ * if crypto is enabled without a keypair
*/
private void validateSecurityFlags(AgentConfiguration config) {
if (config.getSecurity() == null) {
return;
}
var security = config.getSecurity();
+
boolean anyCryptoEnabled = security.isSignInterAgentMessages()
- || security.isSignMcpInvocations()
|| security.isRequirePeerVerification();
if (!anyCryptoEnabled) {
return;
@@ -321,7 +323,7 @@ private void validateSecurityFlags(AgentConfiguration config) {
throw new jakarta.ws.rs.BadRequestException(
"Cryptographic identity features require a signing key. "
+ "Generate one via POST /agentstore/{agentId}/signing/keys "
- + "before enabling signInterAgentMessages, signMcpInvocations, "
+ + "before enabling signInterAgentMessages "
+ "or requirePeerVerification.");
}
}
diff --git a/src/main/java/ai/labs/eddi/configs/groups/model/GroupConversation.java b/src/main/java/ai/labs/eddi/configs/groups/model/GroupConversation.java
index 4ed48d725..1d9ec58e7 100644
--- a/src/main/java/ai/labs/eddi/configs/groups/model/GroupConversation.java
+++ b/src/main/java/ai/labs/eddi/configs/groups/model/GroupConversation.java
@@ -56,17 +56,41 @@ public class GroupConversation {
* @param signature
* Base64-encoded Ed25519 signature if the agent has
* {@code signInterAgentMessages=true}, null otherwise
+ * @param signatureNonce
+ * UUID nonce for replay protection (null if unsigned)
+ * @param signatureTimestampMs
+ * epoch milliseconds when the envelope was signed (null if unsigned)
+ * @param signatureKeyVersion
+ * version of the signing key used (null if unsigned)
*/
public record TranscriptEntry(String speakerAgentId, String speakerDisplayName, String content, int phaseIndex, String phaseName,
- TranscriptEntryType type, Instant timestamp, String errorReason, String targetAgentId, String signature) {
+ TranscriptEntryType type, Instant timestamp, String errorReason, String targetAgentId, String signature,
+ String signatureNonce, Long signatureTimestampMs, Integer signatureKeyVersion) {
/**
- * Backward-compatible constructor without signature (defaults to null).
+ * Backward-compatible constructor without any signature fields.
*/
public TranscriptEntry(String speakerAgentId, String speakerDisplayName, String content, int phaseIndex, String phaseName,
TranscriptEntryType type, Instant timestamp, String errorReason, String targetAgentId) {
this(speakerAgentId, speakerDisplayName, content, phaseIndex, phaseName,
- type, timestamp, errorReason, targetAgentId, null);
+ type, timestamp, errorReason, targetAgentId, null, null, null, null);
+ }
+
+ /**
+ * Backward-compatible constructor with signature only (no envelope data).
+ */
+ public TranscriptEntry(String speakerAgentId, String speakerDisplayName, String content, int phaseIndex, String phaseName,
+ TranscriptEntryType type, Instant timestamp, String errorReason, String targetAgentId, String signature) {
+ this(speakerAgentId, speakerDisplayName, content, phaseIndex, phaseName,
+ type, timestamp, errorReason, targetAgentId, signature, null, null, null);
+ }
+
+ /**
+ * Check whether this entry has full envelope data (signature + nonce +
+ * timestamp) suitable for cryptographic verification.
+ */
+ public boolean hasEnvelopeData() {
+ return signature != null && signatureNonce != null && signatureTimestampMs != null;
}
}
diff --git a/src/main/java/ai/labs/eddi/datastore/DataStoreProducers.java b/src/main/java/ai/labs/eddi/datastore/DataStoreProducers.java
index 4127d05c3..2f08732d3 100644
--- a/src/main/java/ai/labs/eddi/datastore/DataStoreProducers.java
+++ b/src/main/java/ai/labs/eddi/datastore/DataStoreProducers.java
@@ -172,4 +172,12 @@ public IAttachmentStore attachmentStore(
Instance postgres) {
return isPostgres() ? postgres.get() : mongo.get();
}
+
+ @Produces
+ @ApplicationScoped
+ public ai.labs.eddi.engine.tenancy.ITenantQuotaStore tenantQuotaStore(
+ Instance mongo,
+ Instance postgres) {
+ return isPostgres() ? postgres.get() : mongo.get();
+ }
}
diff --git a/src/main/java/ai/labs/eddi/engine/internal/GroupConversationService.java b/src/main/java/ai/labs/eddi/engine/internal/GroupConversationService.java
index 9a534250e..e819615b0 100644
--- a/src/main/java/ai/labs/eddi/engine/internal/GroupConversationService.java
+++ b/src/main/java/ai/labs/eddi/engine/internal/GroupConversationService.java
@@ -6,6 +6,10 @@
import ai.labs.eddi.configs.agents.AgentSigningService;
import ai.labs.eddi.configs.agents.IAgentStore;
+import ai.labs.eddi.configs.agents.crypto.AgentPublicKey;
+import ai.labs.eddi.configs.agents.crypto.NonceCacheService;
+import ai.labs.eddi.configs.agents.crypto.SignedEnvelope;
+import ai.labs.eddi.utils.LogSanitizer;
import ai.labs.eddi.configs.groups.IAgentGroupStore;
import ai.labs.eddi.configs.groups.IGroupConversationStore;
@@ -74,8 +78,14 @@ public class GroupConversationService implements IGroupConversationService {
private final ExecutorService executorService;
private final AgentSigningService agentSigningService;
private final IAgentStore agentStore;
+ private final NonceCacheService nonceCacheService;
private final String defaultTenantId;
+ // Incremental peer verification: tracks the last verified transcript index
+ // per group conversation ID, so we only verify new entries each turn (O(N)
+ // amortized instead of O(NΒ²)). Cleaned up when conversations complete.
+ private final ConcurrentHashMap lastVerifiedIndex = new ConcurrentHashMap<>();
+
// Metrics
private final Timer timerGroupDiscussion;
private final Counter counterGroupDiscussion;
@@ -85,6 +95,7 @@ public class GroupConversationService implements IGroupConversationService {
public GroupConversationService(IAgentGroupStore groupStore, IGroupConversationStore conversationStore, IConversationService conversationService,
IAgentFactory agentFactory, ITemplatingEngine templatingEngine, IJsonSerialization jsonSerialization, MeterRegistry meterRegistry,
AgentSigningService agentSigningService, IAgentStore agentStore,
+ NonceCacheService nonceCacheService,
@ConfigProperty(name = "eddi.tenant.default-id", defaultValue = "default") String defaultTenantId,
@ConfigProperty(name = "eddi.groups.max-depth", defaultValue = "3") int maxDepth) {
this.groupStore = groupStore;
@@ -96,6 +107,7 @@ public GroupConversationService(IAgentGroupStore groupStore, IGroupConversationS
this.maxDepth = maxDepth;
this.agentSigningService = agentSigningService;
this.agentStore = agentStore;
+ this.nonceCacheService = nonceCacheService;
this.defaultTenantId = defaultTenantId;
// Virtual threads β lightweight, no pool sizing, ideal for parallel agent calls
this.executorService = Executors.newVirtualThreadPerTaskExecutor();
@@ -313,6 +325,8 @@ private GroupConversation executeDiscussion(GroupConversation gc, AgentGroupConf
throw new GroupDiscussionException("Group discussion failed: " + e.getMessage(), e);
} finally {
timerGroupDiscussion.record(System.nanoTime() - startTime, TimeUnit.NANOSECONDS);
+ // Clean up incremental verification cursor β conversation is done
+ lastVerifiedIndex.remove(gc.getId());
}
}
@@ -574,6 +588,11 @@ private TranscriptEntry executeAgentTurn(GroupMember member, GroupConversation g
context.put("groupId", new Context(Context.ContextType.string, gc.getGroupId()));
context.put("groupConversationId", new Context(Context.ContextType.string, gc.getId()));
context.put("groupDepth", new Context(Context.ContextType.string, String.valueOf(gc.getDepth())));
+
+ // Wave 6: Peer verification β if the receiving agent requires it,
+ // verify all signed entries from prior speakers before sending context
+ verifyPriorEntriesIfRequired(member.agentId(), gc);
+
inputData.setContext(context);
// Call through ConversationService with retry
@@ -593,27 +612,93 @@ private TranscriptEntry executeAgentTurn(GroupMember member, GroupConversation g
String response = responseFuture.get(timeout, TimeUnit.SECONDS);
- // Wave 6: Sign inter-agent messages if configured
+ // Wave 6: Sign inter-agent messages with full envelope if configured
String signature = null;
- try {
- var resourceId = agentStore.getCurrentResourceId(member.agentId());
- var agentConfig = agentStore.read(member.agentId(), resourceId.getVersion());
- if (agentConfig.getSecurity() != null
- && agentConfig.getSecurity().isSignInterAgentMessages()
- && response != null) {
- signature = agentSigningService.sign(
- defaultTenantId, member.agentId(), response);
- LOGGER.debugf("Signed inter-agent message from '%s' (sig=%s...)",
- member.agentId(),
- signature.length() > 16 ? signature.substring(0, 16) : signature);
+ String signatureNonce = null;
+ Long signatureTimestampMs = null;
+ Integer signatureKeyVersion = null;
+ // Skip signing if crypto infrastructure is not injected
+ if (agentStore != null && agentSigningService != null && nonceCacheService != null) {
+ try {
+ var resourceId = agentStore.getCurrentResourceId(member.agentId());
+ var agentConfig = agentStore.read(member.agentId(), resourceId.getVersion());
+ if (agentConfig.getSecurity() != null
+ && agentConfig.getSecurity().isSignInterAgentMessages()
+ && response != null) {
+ // Create SignedEnvelope with nonce for replay protection
+ var envelope = SignedEnvelope.forSigning(
+ member.agentId(), gc.getGroupId(),
+ Map.of("content", response, "phase", phase.name()));
+ int keyVersion = 0;
+ if (agentConfig.getIdentity() != null
+ && agentConfig.getIdentity().getKeys() != null
+ && !agentConfig.getIdentity().getKeys().isEmpty()) {
+ keyVersion = agentConfig.getIdentity().getKeys().stream()
+ .mapToInt(AgentPublicKey::version)
+ .max().orElse(0);
+ }
+ var signedEnvelope = agentSigningService.signEnvelope(
+ defaultTenantId, member.agentId(), envelope, keyVersion);
+
+ // Immediate self-verification: sanity-check the signature.
+ // If this fails, the signature is broken β do NOT store it.
+ String publicKey = agentConfig.getIdentity() != null
+ ? agentConfig.getIdentity()
+ .getKeyValidAt(signedEnvelope.timestampMs())
+ : null;
+ if (publicKey != null) {
+ boolean valid = agentSigningService.verifyEnvelope(
+ signedEnvelope, publicKey);
+ if (!valid) {
+ LOGGER.errorf("SELF-VERIFY FAILED for agent '%s' "
+ + "β key mismatch or signing error. "
+ + "Falling back to unsigned entry.",
+ member.agentId());
+ // Fall back to unsigned: do NOT store broken signature
+ signedEnvelope = null;
+ }
+ }
+
+ // Nonce validation: register nonce to prevent replay.
+ // If validation fails (stale/skewed), discard the signature.
+ if (signedEnvelope != null) {
+ var nonceResult = nonceCacheService.validate(
+ signedEnvelope.nonce(), signedEnvelope.timestampMs());
+ if (nonceResult != NonceCacheService.NonceValidation.VALID) {
+ LOGGER.warnf("Nonce validation failed for agent '%s': %s "
+ + "β falling back to unsigned entry",
+ member.agentId(), nonceResult);
+ signedEnvelope = null;
+ }
+ }
+
+ // Store full envelope data for peer verification
+ if (signedEnvelope != null) {
+ signature = signedEnvelope.signature();
+ signatureNonce = signedEnvelope.nonce();
+ signatureTimestampMs = signedEnvelope.timestampMs();
+ signatureKeyVersion = signedEnvelope.keyVersion();
+
+ LOGGER.debugf("Signed inter-agent envelope from '%s' "
+ + "(nonce=%s, keyV=%d, sig=%s...)",
+ member.agentId(), signatureNonce,
+ signatureKeyVersion,
+ signature.length() > 16
+ ? signature.substring(0, 16)
+ : signature);
+ }
+ }
+ } catch (Exception sigEx) {
+ LOGGER.warnf("Failed to sign message from agent '%s': %s",
+ member.agentId(), sigEx.getMessage());
}
- } catch (Exception sigEx) {
- LOGGER.warnf("Failed to sign message from agent '%s': %s",
- member.agentId(), sigEx.getMessage());
}
- var entry = new TranscriptEntry(member.agentId(), member.displayName(), response, phaseIdx, phase.name(), entryType, Instant.now(),
- null, targetAgentId, signature);
+ var entry = new TranscriptEntry(
+ member.agentId(), member.displayName(), response,
+ phaseIdx, phase.name(), entryType, Instant.now(),
+ null, targetAgentId, signature,
+ signatureNonce, signatureTimestampMs, signatureKeyVersion);
return entry;
} catch (TimeoutException e) {
@@ -976,4 +1061,137 @@ private String buildPlainTextFallback(DiscussionPhase phase, GroupMember speaker
sb.append(", please contribute to this phase of the discussion.");
return sb.toString();
}
+
+ /**
+ * Verify signed transcript entries from prior speakers if the receiving agent
+ * has {@code requirePeerVerification=true}.
+ *
+ * For each signed entry with full envelope data, this method:
+ *
+ *
Reconstructs the
+ * {@link ai.labs.eddi.configs.agents.crypto.SignedEnvelope} from stored
+ * fields
+ *
Loads the speaker's public key from the agent config
+ *
Verifies the signature against the canonical envelope form
+ *
+ * Invalid signatures are logged as security warnings. This is defense-in-depth:
+ * the signing code already self-verifies at creation time, so failures here
+ * indicate either key rotation issues or data corruption.
+ *
+ * @param receivingAgentId
+ * the agent about to receive the transcript
+ * @param gc
+ * the group conversation containing the transcript
+ */
+ private void verifyPriorEntriesIfRequired(String receivingAgentId, GroupConversation gc) {
+ // Skip if crypto infrastructure is not injected
+ if (agentStore == null || agentSigningService == null) {
+ return;
+ }
+ try {
+ var resourceId = agentStore.getCurrentResourceId(receivingAgentId);
+ if (resourceId == null) {
+ return;
+ }
+ var receiverConfig = agentStore.read(receivingAgentId, resourceId.getVersion());
+ if (receiverConfig.getSecurity() == null
+ || !receiverConfig.getSecurity().isRequirePeerVerification()) {
+ return;
+ }
+
+ List transcript = gc.getTranscript();
+ int totalEntries = transcript.size();
+
+ // Incremental verification: only verify entries added since last check
+ int startIdx = lastVerifiedIndex.getOrDefault(gc.getId(), 0);
+ if (startIdx >= totalEntries) {
+ return; // Nothing new to verify
+ }
+
+ LOGGER.debugf("Peer verification for agent '%s' β verifying entries %d..%d (of %d total)",
+ receivingAgentId, startIdx, totalEntries - 1, totalEntries);
+
+ int verified = 0;
+ int failed = 0;
+ int unsigned = 0;
+
+ // Cache public keys per speaker to avoid redundant agentStore reads
+ Map publicKeyCache = new HashMap<>();
+
+ for (int i = startIdx; i < totalEntries; i++) {
+ TranscriptEntry entry = transcript.get(i);
+ // Skip non-agent entries (user questions, errors, etc.)
+ if ("user".equals(entry.speakerAgentId()) || entry.content() == null) {
+ continue;
+ }
+
+ if (!entry.hasEnvelopeData()) {
+ unsigned++;
+ LOGGER.warnf("UNSIGNED entry from agent '%s' in group '%s' β "
+ + "peer verification required but entry has no envelope data",
+ entry.speakerAgentId(), LogSanitizer.sanitize(gc.getGroupId()));
+ continue;
+ }
+
+ // Reconstruct envelope for verification
+ var envelope = new SignedEnvelope(
+ entry.speakerAgentId(), gc.getGroupId(),
+ Map.of("content", entry.content(), "phase", entry.phaseName()),
+ entry.signatureNonce(), entry.signatureTimestampMs(),
+ entry.signature(), entry.signatureKeyVersion());
+
+ // Get speaker's public key (cached per speaker)
+ try {
+ String publicKey = publicKeyCache.computeIfAbsent(entry.speakerAgentId(), agentId -> {
+ try {
+ var speakerResourceId = agentStore.getCurrentResourceId(agentId);
+ if (speakerResourceId == null) {
+ return null;
+ }
+ var speakerConfig = agentStore.read(agentId, speakerResourceId.getVersion());
+ return speakerConfig.getIdentity() != null
+ ? speakerConfig.getIdentity()
+ .getKeyValidAt(entry.signatureTimestampMs())
+ : null;
+ } catch (Exception e) {
+ LOGGER.warnf("Error loading public key for agent '%s': %s",
+ agentId, e.getMessage());
+ return null;
+ }
+ });
+
+ if (publicKey == null) {
+ LOGGER.warnf("No public key found for agent '%s' β cannot verify signature",
+ entry.speakerAgentId());
+ failed++;
+ continue;
+ }
+
+ boolean valid = agentSigningService.verifyEnvelope(envelope, publicKey);
+ if (valid) {
+ verified++;
+ } else {
+ failed++;
+ LOGGER.errorf("SIGNATURE VERIFICATION FAILED for entry from agent '%s' "
+ + "(nonce=%s, keyV=%d) β potential tampering or key rotation issue",
+ entry.speakerAgentId(), entry.signatureNonce(),
+ entry.signatureKeyVersion());
+ }
+ } catch (Exception e) {
+ failed++;
+ LOGGER.warnf("Error verifying entry from agent '%s': %s",
+ entry.speakerAgentId(), e.getMessage());
+ }
+ }
+
+ // Update the cursor for this conversation
+ lastVerifiedIndex.put(gc.getId(), totalEntries);
+
+ LOGGER.infof("Peer verification for agent '%s': %d verified, %d failed, %d unsigned (range %d..%d)",
+ receivingAgentId, verified, failed, unsigned, startIdx, totalEntries - 1);
+ } catch (Exception e) {
+ LOGGER.warnf("Peer verification check failed for agent '%s': %s",
+ receivingAgentId, e.getMessage());
+ }
+ }
}
diff --git a/src/main/java/ai/labs/eddi/engine/tenancy/MongoTenantQuotaStore.java b/src/main/java/ai/labs/eddi/engine/tenancy/MongoTenantQuotaStore.java
new file mode 100644
index 000000000..a23151983
--- /dev/null
+++ b/src/main/java/ai/labs/eddi/engine/tenancy/MongoTenantQuotaStore.java
@@ -0,0 +1,284 @@
+/*
+ * Copyright EDDI contributors
+ * SPDX-License-Identifier: Apache-2.0
+ */
+package ai.labs.eddi.engine.tenancy;
+
+import ai.labs.eddi.engine.tenancy.model.QuotaCheckResult;
+import ai.labs.eddi.engine.tenancy.model.TenantQuota;
+import ai.labs.eddi.engine.tenancy.model.UsageSnapshot;
+import ai.labs.eddi.utils.LogSanitizer;
+import com.mongodb.client.MongoCollection;
+import com.mongodb.client.MongoDatabase;
+import com.mongodb.client.model.Filters;
+import com.mongodb.client.model.FindOneAndUpdateOptions;
+import com.mongodb.client.model.ReturnDocument;
+import com.mongodb.client.model.Updates;
+import io.quarkus.arc.DefaultBean;
+import jakarta.enterprise.context.ApplicationScoped;
+import jakarta.inject.Inject;
+import org.bson.Document;
+import org.jboss.logging.Logger;
+
+import java.time.Instant;
+import java.time.YearMonth;
+import java.time.ZoneOffset;
+import java.time.temporal.ChronoUnit;
+import java.util.ArrayList;
+import java.util.List;
+
+/**
+ * MongoDB-backed tenant quota store. Uses {@code findOneAndModify} for atomic
+ * counter operations β safe for multi-instance deployments.
+ *
{@code tenant_usage} β rolling usage counters (daily conversations,
+ * per-minute API calls, monthly cost)
+ *
+ *
+ * @since 6.0.0
+ */
+@DefaultBean
+@ApplicationScoped
+public class MongoTenantQuotaStore implements ITenantQuotaStore {
+
+ private static final Logger LOGGER = Logger.getLogger(MongoTenantQuotaStore.class);
+ private static final String QUOTAS_COLLECTION = "tenant_quotas";
+ private static final String USAGE_COLLECTION = "tenant_usage";
+
+ private final MongoCollection quotas;
+ private final MongoCollection usage;
+
+ @Inject
+ public MongoTenantQuotaStore(MongoDatabase database) {
+ this.quotas = database.getCollection(QUOTAS_COLLECTION);
+ this.usage = database.getCollection(USAGE_COLLECTION);
+
+ // Ensure unique index on tenantId to prevent duplicate rows from upsert races
+ var indexOptions = new com.mongodb.client.model.IndexOptions().unique(true);
+ quotas.createIndex(new Document("tenantId", 1), indexOptions);
+ usage.createIndex(new Document("tenantId", 1), indexOptions);
+ }
+
+ // βββ Quota Configuration βββ
+
+ @Override
+ public TenantQuota getQuota(String tenantId) {
+ Document doc = quotas.find(Filters.eq("tenantId", tenantId)).first();
+ return doc != null ? toQuota(doc) : null;
+ }
+
+ @Override
+ public void setQuota(TenantQuota quota) {
+ quotas.findOneAndUpdate(
+ Filters.eq("tenantId", quota.tenantId()),
+ Updates.combine(
+ Updates.set("tenantId", quota.tenantId()),
+ Updates.set("maxConversationsPerDay", quota.maxConversationsPerDay()),
+ Updates.set("maxAgentsPerTenant", quota.maxAgentsPerTenant()),
+ Updates.set("maxApiCallsPerMinute", quota.maxApiCallsPerMinute()),
+ Updates.set("maxMonthlyCostUsd", quota.maxMonthlyCostUsd()),
+ Updates.set("enabled", quota.enabled())),
+ new FindOneAndUpdateOptions().upsert(true));
+ }
+
+ @Override
+ public List listQuotas() {
+ List result = new ArrayList<>();
+ for (Document doc : quotas.find()) {
+ result.add(toQuota(doc));
+ }
+ return result;
+ }
+
+ @Override
+ public void deleteQuota(String tenantId) {
+ quotas.deleteOne(Filters.eq("tenantId", tenantId));
+ usage.deleteOne(Filters.eq("tenantId", tenantId));
+ }
+
+ // βββ Atomic Usage Operations βββ
+ //
+ // Note: The increment methods use two sequential findOneAndUpdate calls
+ // (1: increment if in window + under limit, 2: reset if stale window).
+ // There is a minor TOCTOU race at window boundaries in multi-instance
+ // deployments: between call 1 and call 2, another instance may reset the
+ // window. This can cause a single false denial per window transition.
+ // This is acceptable for quota enforcement β the consequence is one
+ // request getting a "limit reached" response at a boundary that would
+ // succeed on retry. Not a data corruption risk.
+
+ @Override
+ public QuotaCheckResult tryIncrementConversations(String tenantId, int limit) {
+ if (limit < 0) {
+ return QuotaCheckResult.OK;
+ }
+
+ Instant dayStart = Instant.now().truncatedTo(ChronoUnit.DAYS);
+
+ // Atomic: reset if expired + increment if under limit
+ Document result = usage.findOneAndUpdate(
+ Filters.and(
+ Filters.eq("tenantId", tenantId),
+ Filters.gte("dayStart", dayStart.toEpochMilli()),
+ Filters.lt("conversationsToday", limit)),
+ Updates.combine(
+ Updates.inc("conversationsToday", 1),
+ Updates.setOnInsert("tenantId", tenantId),
+ Updates.setOnInsert("dayStart", dayStart.toEpochMilli())),
+ new FindOneAndUpdateOptions().upsert(true).returnDocument(ReturnDocument.AFTER));
+
+ if (result == null) {
+ // Slot not acquired β check if it's a window reset or a real limit breach
+ Document existing = usage.findOneAndUpdate(
+ Filters.and(
+ Filters.eq("tenantId", tenantId),
+ Filters.lt("dayStart", dayStart.toEpochMilli())),
+ Updates.combine(
+ Updates.set("conversationsToday", 1),
+ Updates.set("dayStart", dayStart.toEpochMilli())),
+ new FindOneAndUpdateOptions().returnDocument(ReturnDocument.AFTER));
+
+ if (existing != null) {
+ return QuotaCheckResult.OK; // Window was stale, reset succeeded
+ }
+ return QuotaCheckResult.denied("Daily conversation limit reached (" + limit + ")");
+ }
+ return QuotaCheckResult.OK;
+ }
+
+ @Override
+ public QuotaCheckResult tryIncrementApiCalls(String tenantId, int limit) {
+ if (limit < 0) {
+ return QuotaCheckResult.OK;
+ }
+
+ long minuteStart = Instant.now().truncatedTo(ChronoUnit.MINUTES).toEpochMilli();
+
+ Document result = usage.findOneAndUpdate(
+ Filters.and(
+ Filters.eq("tenantId", tenantId),
+ Filters.gte("minuteStart", minuteStart),
+ Filters.lt("apiCallsThisMinute", limit)),
+ Updates.combine(
+ Updates.inc("apiCallsThisMinute", 1),
+ Updates.setOnInsert("tenantId", tenantId),
+ Updates.setOnInsert("minuteStart", minuteStart)),
+ new FindOneAndUpdateOptions().upsert(true).returnDocument(ReturnDocument.AFTER));
+
+ if (result == null) {
+ Document existing = usage.findOneAndUpdate(
+ Filters.and(
+ Filters.eq("tenantId", tenantId),
+ Filters.lt("minuteStart", minuteStart)),
+ Updates.combine(
+ Updates.set("apiCallsThisMinute", 1),
+ Updates.set("minuteStart", minuteStart)),
+ new FindOneAndUpdateOptions().returnDocument(ReturnDocument.AFTER));
+
+ if (existing != null) {
+ return QuotaCheckResult.OK;
+ }
+ return QuotaCheckResult.denied("API rate limit reached (" + limit + "/min)");
+ }
+ return QuotaCheckResult.OK;
+ }
+
+ @Override
+ public QuotaCheckResult tryAddCost(String tenantId, double cost, double limit) {
+ YearMonth currentMonth = YearMonth.now(ZoneOffset.UTC);
+ String monthKey = currentMonth.toString();
+
+ // Always add the cost (post-call accounting)
+ Document result = usage.findOneAndUpdate(
+ Filters.and(
+ Filters.eq("tenantId", tenantId),
+ Filters.eq("costMonth", monthKey)),
+ Updates.combine(
+ Updates.inc("monthlyCostUsd", cost),
+ Updates.setOnInsert("tenantId", tenantId),
+ Updates.setOnInsert("costMonth", monthKey)),
+ new FindOneAndUpdateOptions().upsert(true).returnDocument(ReturnDocument.AFTER));
+
+ if (result == null) {
+ // Stale month β reset
+ usage.findOneAndUpdate(
+ Filters.eq("tenantId", tenantId),
+ Updates.combine(
+ Updates.set("monthlyCostUsd", cost),
+ Updates.set("costMonth", monthKey)),
+ new FindOneAndUpdateOptions().upsert(true));
+ return QuotaCheckResult.OK;
+ }
+
+ double totalCost = result.getDouble("monthlyCostUsd");
+ if (limit >= 0 && totalCost > limit) {
+ return QuotaCheckResult.denied(
+ "Monthly cost budget exceeded ($%.2f / $%.2f)".formatted(totalCost, limit));
+ }
+ return QuotaCheckResult.OK;
+ }
+
+ // βββ Usage Reporting βββ
+
+ @Override
+ public UsageSnapshot getUsage(String tenantId) {
+ Document doc = usage.find(Filters.eq("tenantId", tenantId)).first();
+ if (doc == null) {
+ return UsageSnapshot.empty(tenantId);
+ }
+ return toSnapshot(tenantId, doc);
+ }
+
+ @Override
+ public double getMonthlyCost(String tenantId) {
+ Document doc = usage.find(Filters.eq("tenantId", tenantId)).first();
+ if (doc == null) {
+ return 0.0;
+ }
+ YearMonth currentMonth = YearMonth.now(ZoneOffset.UTC);
+ String monthKey = doc.getString("costMonth");
+ if (monthKey == null || !monthKey.equals(currentMonth.toString())) {
+ return 0.0; // Stale month
+ }
+ return doc.getDouble("monthlyCostUsd") != null ? doc.getDouble("monthlyCostUsd") : 0.0;
+ }
+
+ @Override
+ public void resetUsage(String tenantId) {
+ usage.deleteOne(Filters.eq("tenantId", tenantId));
+ LOGGER.infof("Reset usage counters for tenant '%s'", LogSanitizer.sanitize(tenantId));
+ }
+
+ // βββ Mapping βββ
+
+ private TenantQuota toQuota(Document doc) {
+ return new TenantQuota(
+ doc.getString("tenantId"),
+ doc.getInteger("maxConversationsPerDay", -1),
+ doc.getInteger("maxAgentsPerTenant", -1),
+ doc.getInteger("maxApiCallsPerMinute", -1),
+ doc.getDouble("maxMonthlyCostUsd") != null ? doc.getDouble("maxMonthlyCostUsd") : -1.0,
+ doc.getBoolean("enabled", false));
+ }
+
+ private UsageSnapshot toSnapshot(String tenantId, Document doc) {
+ Instant minuteStart = doc.getLong("minuteStart") != null
+ ? Instant.ofEpochMilli(doc.getLong("minuteStart"))
+ : Instant.now();
+ Instant dayStart = doc.getLong("dayStart") != null
+ ? Instant.ofEpochMilli(doc.getLong("dayStart"))
+ : Instant.now();
+ YearMonth costMonth = doc.getString("costMonth") != null
+ ? YearMonth.parse(doc.getString("costMonth"))
+ : YearMonth.now(ZoneOffset.UTC);
+ return new UsageSnapshot(
+ tenantId,
+ doc.getInteger("conversationsToday", 0),
+ doc.getInteger("apiCallsThisMinute", 0),
+ doc.getDouble("monthlyCostUsd") != null ? doc.getDouble("monthlyCostUsd") : 0.0,
+ minuteStart, dayStart, costMonth);
+ }
+}
diff --git a/src/main/java/ai/labs/eddi/engine/tenancy/PostgresTenantQuotaStore.java b/src/main/java/ai/labs/eddi/engine/tenancy/PostgresTenantQuotaStore.java
new file mode 100644
index 000000000..1247acfea
--- /dev/null
+++ b/src/main/java/ai/labs/eddi/engine/tenancy/PostgresTenantQuotaStore.java
@@ -0,0 +1,427 @@
+/*
+ * Copyright EDDI contributors
+ * SPDX-License-Identifier: Apache-2.0
+ */
+package ai.labs.eddi.engine.tenancy;
+
+import ai.labs.eddi.engine.tenancy.model.QuotaCheckResult;
+import ai.labs.eddi.engine.tenancy.model.TenantQuota;
+import ai.labs.eddi.engine.tenancy.model.UsageSnapshot;
+import static ai.labs.eddi.utils.LogSanitizer.sanitize;
+import io.quarkus.arc.DefaultBean;
+import jakarta.enterprise.context.ApplicationScoped;
+import jakarta.enterprise.inject.Instance;
+import jakarta.inject.Inject;
+import org.jboss.logging.Logger;
+
+import javax.sql.DataSource;
+import java.sql.*;
+import java.time.Instant;
+import java.time.YearMonth;
+import java.time.ZoneOffset;
+import java.time.temporal.ChronoUnit;
+import java.util.ArrayList;
+import java.util.List;
+
+/**
+ * PostgreSQL-backed tenant quota store. Uses
+ * {@code UPDATE ... WHERE ... RETURNING} for atomic counter operations β safe
+ * for multi-instance deployments.
+ *
+ * Schema is auto-created via {@code CREATE TABLE IF NOT EXISTS} on first
+ * access, following the established pattern from
+ * {@code PostgresGlobalVariableStore}.
+ *
+ * Tables:
+ *
+ *
{@code tenant_quotas} β quota configuration
+ *
{@code tenant_usage} β rolling usage counters
+ *
+ *
+ * @since 6.0.0
+ */
+@DefaultBean
+@ApplicationScoped
+public class PostgresTenantQuotaStore implements ITenantQuotaStore {
+
+ private static final Logger LOGGER = Logger.getLogger(PostgresTenantQuotaStore.class);
+
+ private static final String CREATE_QUOTAS_TABLE = """
+ CREATE TABLE IF NOT EXISTS tenant_quotas (
+ tenant_id VARCHAR(255) PRIMARY KEY,
+ max_conversations_per_day INT NOT NULL DEFAULT -1,
+ max_agents_per_tenant INT NOT NULL DEFAULT -1,
+ max_api_calls_per_minute INT NOT NULL DEFAULT -1,
+ max_monthly_cost_usd DOUBLE PRECISION NOT NULL DEFAULT -1.0,
+ enabled BOOLEAN NOT NULL DEFAULT TRUE
+ )
+ """;
+
+ private static final String CREATE_USAGE_TABLE = """
+ CREATE TABLE IF NOT EXISTS tenant_usage (
+ tenant_id VARCHAR(255) PRIMARY KEY,
+ conversations_today INT NOT NULL DEFAULT 0,
+ day_start BIGINT NOT NULL DEFAULT 0,
+ api_calls_this_minute INT NOT NULL DEFAULT 0,
+ minute_start BIGINT NOT NULL DEFAULT 0,
+ monthly_cost_usd DOUBLE PRECISION NOT NULL DEFAULT 0.0,
+ cost_month VARCHAR(10)
+ )
+ """;
+
+ private final Instance dataSourceInstance;
+ private volatile boolean schemaInitialized = false;
+
+ @Inject
+ public PostgresTenantQuotaStore(Instance dataSourceInstance) {
+ this.dataSourceInstance = dataSourceInstance;
+ }
+
+ private synchronized void ensureSchema() {
+ if (schemaInitialized)
+ return;
+ try (Connection conn = dataSourceInstance.get().getConnection();
+ Statement stmt = conn.createStatement()) {
+ stmt.execute(CREATE_QUOTAS_TABLE);
+ stmt.execute(CREATE_USAGE_TABLE);
+ schemaInitialized = true;
+ LOGGER.info("PostgresTenantQuotaStore initialized (tables=tenant_quotas, tenant_usage)");
+ } catch (SQLException e) {
+ throw new RuntimeException("Failed to initialize tenant quota tables", e);
+ }
+ }
+
+ // βββ Quota Configuration βββ
+
+ @Override
+ public TenantQuota getQuota(String tenantId) {
+ ensureSchema();
+ try (Connection conn = dataSourceInstance.get().getConnection();
+ PreparedStatement ps = conn.prepareStatement(
+ "SELECT * FROM tenant_quotas WHERE tenant_id = ?")) {
+ ps.setString(1, tenantId);
+ try (ResultSet rs = ps.executeQuery()) {
+ if (rs.next()) {
+ return toQuota(rs);
+ }
+ }
+ } catch (SQLException e) {
+ LOGGER.warnf("Failed to read quota for tenant '%s': %s", sanitize(tenantId), sanitize(e.getMessage()));
+ }
+ return null;
+ }
+
+ @Override
+ public void setQuota(TenantQuota quota) {
+ ensureSchema();
+ try (Connection conn = dataSourceInstance.get().getConnection();
+ PreparedStatement ps = conn.prepareStatement(
+ """
+ INSERT INTO tenant_quotas (tenant_id, max_conversations_per_day, max_agents_per_tenant,
+ max_api_calls_per_minute, max_monthly_cost_usd, enabled)
+ VALUES (?, ?, ?, ?, ?, ?)
+ ON CONFLICT (tenant_id) DO UPDATE SET
+ max_conversations_per_day = EXCLUDED.max_conversations_per_day,
+ max_agents_per_tenant = EXCLUDED.max_agents_per_tenant,
+ max_api_calls_per_minute = EXCLUDED.max_api_calls_per_minute,
+ max_monthly_cost_usd = EXCLUDED.max_monthly_cost_usd,
+ enabled = EXCLUDED.enabled
+ """)) {
+ ps.setString(1, quota.tenantId());
+ ps.setInt(2, quota.maxConversationsPerDay());
+ ps.setInt(3, quota.maxAgentsPerTenant());
+ ps.setInt(4, quota.maxApiCallsPerMinute());
+ ps.setDouble(5, quota.maxMonthlyCostUsd());
+ ps.setBoolean(6, quota.enabled());
+ ps.executeUpdate();
+ } catch (SQLException e) {
+ LOGGER.errorf("Failed to set quota for tenant '%s': %s", sanitize(quota.tenantId()), sanitize(e.getMessage()));
+ }
+ }
+
+ @Override
+ public List listQuotas() {
+ ensureSchema();
+ List result = new ArrayList<>();
+ try (Connection conn = dataSourceInstance.get().getConnection();
+ PreparedStatement ps = conn.prepareStatement("SELECT * FROM tenant_quotas");
+ ResultSet rs = ps.executeQuery()) {
+ while (rs.next()) {
+ result.add(toQuota(rs));
+ }
+ } catch (SQLException e) {
+ LOGGER.warnf("Failed to list quotas: %s", sanitize(e.getMessage()));
+ }
+ return result;
+ }
+
+ @Override
+ public void deleteQuota(String tenantId) {
+ ensureSchema();
+ try (Connection conn = dataSourceInstance.get().getConnection()) {
+ conn.setAutoCommit(false);
+ try {
+ try (PreparedStatement ps = conn.prepareStatement(
+ "DELETE FROM tenant_quotas WHERE tenant_id = ?")) {
+ ps.setString(1, tenantId);
+ ps.executeUpdate();
+ }
+ try (PreparedStatement ps = conn.prepareStatement(
+ "DELETE FROM tenant_usage WHERE tenant_id = ?")) {
+ ps.setString(1, tenantId);
+ ps.executeUpdate();
+ }
+ conn.commit();
+ } catch (SQLException e) {
+ conn.rollback();
+ throw e;
+ } finally {
+ conn.setAutoCommit(true);
+ }
+ } catch (SQLException e) {
+ LOGGER.warnf("Failed to delete quota for tenant '%s': %s",
+ sanitize(tenantId), sanitize(e.getMessage()));
+ }
+ }
+
+ // βββ Atomic Usage Operations βββ
+
+ @Override
+ public QuotaCheckResult tryIncrementConversations(String tenantId, int limit) {
+ if (limit < 0) {
+ return QuotaCheckResult.OK;
+ }
+
+ long dayStartMs = Instant.now().truncatedTo(ChronoUnit.DAYS).toEpochMilli();
+
+ ensureSchema();
+ try (Connection conn = dataSourceInstance.get().getConnection()) {
+ // First: try atomic increment within current window
+ try (PreparedStatement ps = conn.prepareStatement(
+ """
+ UPDATE tenant_usage SET conversations_today = conversations_today + 1
+ WHERE tenant_id = ? AND day_start = ? AND conversations_today < ?
+ RETURNING conversations_today
+ """)) {
+ ps.setString(1, tenantId);
+ ps.setLong(2, dayStartMs);
+ ps.setInt(3, limit);
+ try (ResultSet rs = ps.executeQuery()) {
+ if (rs.next()) {
+ return QuotaCheckResult.OK;
+ }
+ }
+ }
+
+ // Window may be stale β try to reset and increment atomically
+ try (PreparedStatement ps = conn.prepareStatement(
+ """
+ INSERT INTO tenant_usage (tenant_id, conversations_today, day_start, api_calls_this_minute, minute_start, monthly_cost_usd, cost_month)
+ VALUES (?, 1, ?, 0, ?, 0.0, ?)
+ ON CONFLICT (tenant_id) DO UPDATE SET
+ conversations_today = CASE WHEN tenant_usage.day_start < ? THEN 1 ELSE tenant_usage.conversations_today END,
+ day_start = CASE WHEN tenant_usage.day_start < ? THEN ? ELSE tenant_usage.day_start END
+ RETURNING conversations_today
+ """)) {
+ long minuteStart = Instant.now().truncatedTo(ChronoUnit.MINUTES).toEpochMilli();
+ String costMonth = YearMonth.now(ZoneOffset.UTC).toString();
+ ps.setString(1, tenantId);
+ ps.setLong(2, dayStartMs);
+ ps.setLong(3, minuteStart);
+ ps.setString(4, costMonth);
+ ps.setLong(5, dayStartMs);
+ ps.setLong(6, dayStartMs);
+ ps.setLong(7, dayStartMs);
+ try (ResultSet rs = ps.executeQuery()) {
+ if (rs.next() && rs.getInt(1) <= limit) {
+ return QuotaCheckResult.OK;
+ }
+ }
+ }
+ } catch (SQLException e) {
+ LOGGER.errorf("Failed to increment conversations for tenant '%s': %s", sanitize(tenantId), sanitize(e.getMessage()));
+ }
+
+ return QuotaCheckResult.denied("Daily conversation limit reached (" + limit + ")");
+ }
+
+ @Override
+ public QuotaCheckResult tryIncrementApiCalls(String tenantId, int limit) {
+ if (limit < 0) {
+ return QuotaCheckResult.OK;
+ }
+
+ long minuteStart = Instant.now().truncatedTo(ChronoUnit.MINUTES).toEpochMilli();
+
+ ensureSchema();
+ try (Connection conn = dataSourceInstance.get().getConnection()) {
+ try (PreparedStatement ps = conn.prepareStatement(
+ """
+ UPDATE tenant_usage SET api_calls_this_minute = api_calls_this_minute + 1
+ WHERE tenant_id = ? AND minute_start = ? AND api_calls_this_minute < ?
+ RETURNING api_calls_this_minute
+ """)) {
+ ps.setString(1, tenantId);
+ ps.setLong(2, minuteStart);
+ ps.setInt(3, limit);
+ try (ResultSet rs = ps.executeQuery()) {
+ if (rs.next()) {
+ return QuotaCheckResult.OK;
+ }
+ }
+ }
+
+ // Window may be stale β reset
+ try (PreparedStatement ps = conn.prepareStatement(
+ """
+ INSERT INTO tenant_usage (tenant_id, conversations_today, day_start, api_calls_this_minute, minute_start, monthly_cost_usd, cost_month)
+ VALUES (?, 0, ?, 1, ?, 0.0, ?)
+ ON CONFLICT (tenant_id) DO UPDATE SET
+ api_calls_this_minute = CASE WHEN tenant_usage.minute_start < ? THEN 1 ELSE tenant_usage.api_calls_this_minute END,
+ minute_start = CASE WHEN tenant_usage.minute_start < ? THEN ? ELSE tenant_usage.minute_start END
+ RETURNING api_calls_this_minute
+ """)) {
+ long dayStart = Instant.now().truncatedTo(ChronoUnit.DAYS).toEpochMilli();
+ String costMonth = YearMonth.now(ZoneOffset.UTC).toString();
+ ps.setString(1, tenantId);
+ ps.setLong(2, dayStart);
+ ps.setLong(3, minuteStart);
+ ps.setString(4, costMonth);
+ ps.setLong(5, minuteStart);
+ ps.setLong(6, minuteStart);
+ ps.setLong(7, minuteStart);
+ try (ResultSet rs = ps.executeQuery()) {
+ if (rs.next() && rs.getInt(1) <= limit) {
+ return QuotaCheckResult.OK;
+ }
+ }
+ }
+ } catch (SQLException e) {
+ LOGGER.errorf("Failed to increment API calls for tenant '%s': %s", sanitize(tenantId), sanitize(e.getMessage()));
+ }
+
+ return QuotaCheckResult.denied("API rate limit reached (" + limit + "/min)");
+ }
+
+ @Override
+ public QuotaCheckResult tryAddCost(String tenantId, double cost, double limit) {
+ String monthKey = YearMonth.now(ZoneOffset.UTC).toString();
+
+ ensureSchema();
+ try (Connection conn = dataSourceInstance.get().getConnection();
+ PreparedStatement ps = conn.prepareStatement(
+ """
+ INSERT INTO tenant_usage (tenant_id, conversations_today, day_start, api_calls_this_minute, minute_start, monthly_cost_usd, cost_month)
+ VALUES (?, 0, ?, 0, ?, ?, ?)
+ ON CONFLICT (tenant_id) DO UPDATE SET
+ monthly_cost_usd = CASE WHEN tenant_usage.cost_month = ? THEN tenant_usage.monthly_cost_usd + ? ELSE ? END,
+ cost_month = ?
+ RETURNING monthly_cost_usd
+ """)) {
+ long now = Instant.now().toEpochMilli();
+ ps.setString(1, tenantId);
+ ps.setLong(2, now);
+ ps.setLong(3, now);
+ ps.setDouble(4, cost);
+ ps.setString(5, monthKey);
+ ps.setString(6, monthKey);
+ ps.setDouble(7, cost);
+ ps.setDouble(8, cost);
+ ps.setString(9, monthKey);
+ try (ResultSet rs = ps.executeQuery()) {
+ if (rs.next()) {
+ double totalCost = rs.getDouble(1);
+ if (limit >= 0 && totalCost > limit) {
+ return QuotaCheckResult.denied(
+ "Monthly cost budget exceeded ($%.2f / $%.2f)".formatted(totalCost, limit));
+ }
+ }
+ }
+ } catch (SQLException e) {
+ LOGGER.errorf("Failed to add cost for tenant '%s': %s", sanitize(tenantId), sanitize(e.getMessage()));
+ // Fail closed β if cost accounting fails, deny the request rather than
+ // silently bypassing budget enforcement
+ return QuotaCheckResult.denied("Cost accounting failed β denying request for safety");
+ }
+ return QuotaCheckResult.OK;
+ }
+
+ // βββ Usage Reporting βββ
+
+ @Override
+ public UsageSnapshot getUsage(String tenantId) {
+ ensureSchema();
+ try (Connection conn = dataSourceInstance.get().getConnection();
+ PreparedStatement ps = conn.prepareStatement(
+ "SELECT * FROM tenant_usage WHERE tenant_id = ?")) {
+ ps.setString(1, tenantId);
+ try (ResultSet rs = ps.executeQuery()) {
+ if (rs.next()) {
+ return toSnapshot(tenantId, rs);
+ }
+ }
+ } catch (SQLException e) {
+ LOGGER.warnf("Failed to read usage for tenant '%s': %s", sanitize(tenantId), sanitize(e.getMessage()));
+ }
+ return UsageSnapshot.empty(tenantId);
+ }
+
+ @Override
+ public double getMonthlyCost(String tenantId) {
+ ensureSchema();
+ try (Connection conn = dataSourceInstance.get().getConnection();
+ PreparedStatement ps = conn.prepareStatement(
+ "SELECT monthly_cost_usd, cost_month FROM tenant_usage WHERE tenant_id = ?")) {
+ ps.setString(1, tenantId);
+ try (ResultSet rs = ps.executeQuery()) {
+ if (rs.next()) {
+ String monthKey = rs.getString("cost_month");
+ if (monthKey != null && monthKey.equals(YearMonth.now(ZoneOffset.UTC).toString())) {
+ return rs.getDouble("monthly_cost_usd");
+ }
+ }
+ }
+ } catch (SQLException e) {
+ LOGGER.warnf("Failed to read monthly cost for tenant '%s': %s", sanitize(tenantId), sanitize(e.getMessage()));
+ }
+ return 0.0;
+ }
+
+ @Override
+ public void resetUsage(String tenantId) {
+ ensureSchema();
+ try (Connection conn = dataSourceInstance.get().getConnection();
+ PreparedStatement ps = conn.prepareStatement("DELETE FROM tenant_usage WHERE tenant_id = ?")) {
+ ps.setString(1, tenantId);
+ ps.executeUpdate();
+ LOGGER.infof("Reset usage counters for tenant '%s'", sanitize(tenantId));
+ } catch (SQLException e) {
+ LOGGER.errorf("Failed to reset usage for tenant '%s': %s", sanitize(tenantId), sanitize(e.getMessage()));
+ }
+ }
+
+ // βββ Mapping βββ
+
+ private TenantQuota toQuota(ResultSet rs) throws SQLException {
+ return new TenantQuota(
+ rs.getString("tenant_id"),
+ rs.getInt("max_conversations_per_day"),
+ rs.getInt("max_agents_per_tenant"),
+ rs.getInt("max_api_calls_per_minute"),
+ rs.getDouble("max_monthly_cost_usd"),
+ rs.getBoolean("enabled"));
+ }
+
+ private UsageSnapshot toSnapshot(String tenantId, ResultSet rs) throws SQLException {
+ return new UsageSnapshot(
+ tenantId,
+ rs.getInt("conversations_today"),
+ rs.getInt("api_calls_this_minute"),
+ rs.getDouble("monthly_cost_usd"),
+ Instant.ofEpochMilli(rs.getLong("minute_start")),
+ Instant.ofEpochMilli(rs.getLong("day_start")),
+ rs.getString("cost_month") != null
+ ? YearMonth.parse(rs.getString("cost_month"))
+ : YearMonth.now(ZoneOffset.UTC));
+ }
+}
diff --git a/src/main/java/ai/labs/eddi/modules/llm/impl/AgentOrchestrator.java b/src/main/java/ai/labs/eddi/modules/llm/impl/AgentOrchestrator.java
index 5ad886d5f..f44cac3ba 100644
--- a/src/main/java/ai/labs/eddi/modules/llm/impl/AgentOrchestrator.java
+++ b/src/main/java/ai/labs/eddi/modules/llm/impl/AgentOrchestrator.java
@@ -41,6 +41,8 @@
import static ai.labs.eddi.utils.LogSanitizer.sanitize;
import ai.labs.eddi.engine.tenancy.TenantQuotaService;
+import com.fasterxml.jackson.databind.JsonNode;
+import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.IOException;
import java.util.*;
@@ -212,6 +214,14 @@ private ExecutionResult executeWithTools(ChatModel chatModel, String systemMessa
}
}
+ // --- LAZY mode: separate built-in specs from external specs ---
+ // In LAZY mode, all built-in tool executors are registered (so they CAN be
+ // called), but initially only discover_tools spec is presented to the LLM.
+ // After the LLM calls discover_tools, we parse the result and activate the
+ // matching built-in specs for subsequent iterations.
+ boolean isLazy = task.getToolLoadingStrategy() == LlmConfiguration.ToolLoadingStrategy.LAZY;
+ List builtInSpecs = new ArrayList<>(toolSpecs); // copy before merging external
+
// Merge httpcall tools discovered from workflow (if any)
if (httpCallTools != null && !httpCallTools.toolSpecs().isEmpty()) {
toolSpecs.addAll(httpCallTools.toolSpecs());
@@ -230,6 +240,36 @@ private ExecutionResult executeWithTools(ChatModel chatModel, String systemMessa
toolExecutors.putAll(a2aTools.executors());
}
+ // Active specs: what the LLM currently sees
+ List activeSpecs;
+ if (isLazy) {
+ // Start with only discover_tools + all external tools (HTTP/MCP/A2A)
+ activeSpecs = new ArrayList<>();
+ for (ToolSpecification spec : builtInSpecs) {
+ if ("discover_tools".equals(spec.name())) {
+ activeSpecs.add(spec);
+ }
+ }
+ // Add external tool specs (always visible regardless of strategy)
+ int externalCount = 0;
+ if (httpCallTools != null) {
+ activeSpecs.addAll(httpCallTools.toolSpecs());
+ externalCount += httpCallTools.toolSpecs().size();
+ }
+ if (mcpCallWorkflowTools != null) {
+ activeSpecs.addAll(mcpCallWorkflowTools.toolSpecs());
+ externalCount += mcpCallWorkflowTools.toolSpecs().size();
+ }
+ if (a2aTools != null) {
+ activeSpecs.addAll(a2aTools.toolSpecs());
+ externalCount += a2aTools.toolSpecs().size();
+ }
+ LOGGER.infof("LAZY mode: presenting %d specs initially (discover_tools + %d external)",
+ activeSpecs.size(), externalCount);
+ } else {
+ activeSpecs = toolSpecs;
+ }
+
// Build message list with system message if provided
List messages = new ArrayList<>();
if (!isNullOrEmpty(systemMessage)) {
@@ -269,8 +309,8 @@ private ExecutionResult executeWithTools(ChatModel chatModel, String systemMessa
for (int i = 0; i < maxIterations; i++) {
ChatRequest.Builder requestBuilder = ChatRequest.builder().messages(currentMessages);
- if (!toolSpecs.isEmpty()) {
- requestBuilder.toolSpecifications(toolSpecs);
+ if (!activeSpecs.isEmpty()) {
+ requestBuilder.toolSpecifications(activeSpecs);
}
ChatRequest chatRequest = requestBuilder.build();
@@ -359,6 +399,11 @@ private ExecutionResult executeWithTools(ChatModel chatModel, String systemMessa
trace.add(resultStep);
currentMessages.add(ToolExecutionResultMessage.from(toolRequest, toolResult));
+
+ // LAZY mode: after discover_tools returns, activate the matching built-in specs
+ if (isLazy && "discover_tools".equals(toolRequest.name())) {
+ activateDiscoveredTools(toolResult, builtInSpecs, activeSpecs);
+ }
}
} else {
return aiMessage.text();
@@ -372,8 +417,58 @@ private ExecutionResult executeWithTools(ChatModel chatModel, String systemMessa
return new ExecutionResult(response, trace);
}
+ /**
+ * Parses the discover_tools JSON result and activates matching built-in tool
+ * specs so the LLM can call them on subsequent iterations.
+ */
+ private void activateDiscoveredTools(String discoverResult,
+ List builtInSpecs,
+ List activeSpecs) {
+ try {
+ ObjectMapper mapper = new ObjectMapper();
+ JsonNode root = mapper.readTree(discoverResult);
+ JsonNode toolsNode = root.get("tools");
+ if (toolsNode == null || !toolsNode.isArray()) {
+ return;
+ }
+
+ Set discoveredNames = new HashSet<>();
+ for (JsonNode tool : toolsNode) {
+ if (tool.has("name")) {
+ discoveredNames.add(tool.get("name").asText());
+ }
+ }
+
+ // Add matching specs (skip discover_tools itself and already-active specs)
+ Set activeNames = new HashSet<>();
+ for (ToolSpecification spec : activeSpecs) {
+ activeNames.add(spec.name());
+ }
+
+ int activated = 0;
+ for (ToolSpecification spec : builtInSpecs) {
+ if (discoveredNames.contains(spec.name()) && !activeNames.contains(spec.name())) {
+ activeSpecs.add(spec);
+ activated++;
+ }
+ }
+
+ LOGGER.infof("LAZY activation: %d tools activated from discovery (%s)",
+ activated, discoveredNames);
+ } catch (Exception e) {
+ LOGGER.warnf("Failed to parse discover_tools result for LAZY activation: %s",
+ e.getMessage());
+ }
+ }
+
/**
* Collects enabled built-in tools based on task configuration.
+ *
+ * When {@link LlmConfiguration.ToolLoadingStrategy#LAZY} is set, ALL tools are
+ * returned (so executors get registered), plus a {@link DiscoverToolsTool}
+ * meta-tool. The {@code executeWithTools} method handles presenting only
+ * {@code discover_tools} spec initially and activating matching specs after
+ * discovery.
*/
List