27 commits
b0304e9
feat(a2a): close Wave 3 capability registry gaps
ginccc May 7, 2026
5a7aa4c
test(a2a): add >90% coverage tests for Wave 3 capability registry
ginccc May 7, 2026
77ba1b9
docs(planning): mark Wave 3 capability registry as complete
ginccc May 7, 2026
28e7294
feat(wave1): add behavioral counterweights and identity masking
ginccc May 7, 2026
75abb07
fix(wave1): null-safety and FQN cleanup from code review
ginccc May 7, 2026
05edf60
feat(wave2): MCP governance and token-efficient tool loading
ginccc May 7, 2026
dd375f9
feat(wave4): session safety — memory checkpoints and snapshot service
ginccc May 7, 2026
903a1d0
feat(wave5): multimodal attachments with dual-backend storage
ginccc May 7, 2026
4a717fa
feat(wave6): cryptographic agent identity with key rotation
ginccc May 7, 2026
ee4f0f9
fix(wave4-6): code review cleanup across all waves
ginccc May 7, 2026
77faaf3
refactor(arch): fix 3 pillar compliance concerns across all waves
ginccc May 7, 2026
05a4055
test(coverage): boost Wave 1-6 component coverage >90%, enrich docs
ginccc May 7, 2026
4ad20f8
fix(docs): correct placement values, remove phantom agentName, add us…
ginccc May 7, 2026
b7ac7ac
fix: address PR review feedback — 11 issues resolved
ginccc May 8, 2026
309456d
fix(tests): stabilize all test constructors after interface refactoring
ginccc May 8, 2026
a466211
feat(agentic): wire cryptographic signing, attachment store, and memo…
ginccc May 8, 2026
f612261
refactor(cleanup): remove dead code, fix immutability bug, improve co…
ginccc May 9, 2026
50e52a0
test(coverage): push all new classes above 90% threshold
ginccc May 11, 2026
33232dc
fix(memory): preserve Property scope on checkpoint rollback, eliminat…
ginccc May 13, 2026
d875452
feat(llm): implement summarize truncation strategy with parent task c…
ginccc May 13, 2026
1a31e68
docs: add Manager UI handoff, update plan status for summarize/pagina…
ginccc May 13, 2026
2426e98
fix: address PR review — OOM guard, AsyncResponse, tenant ID signing,…
ginccc May 13, 2026
f09c76b
fix: address PR review round 2 — 8 findings (Copilot + CodeRabbit)
ginccc May 14, 2026
0bf1886
fix(signing): typed cache-loader exception + collision-resistant cach…
ginccc May 14, 2026
e807e8e
Merge remote-tracking branch 'origin/main' into feature/agentic-impro…
ginccc May 14, 2026
0fbf0e5
fix(security): sanitize user-provided values in log statements (CodeQL)
ginccc May 14, 2026
97a41c2
fix(memory): add MAX_DEPTH guard to DeepCopyUtil recursion
ginccc May 15, 2026
30 changes: 30 additions & 0 deletions docs/architecture.md
@@ -501,8 +501,38 @@ Tool Call ──▶ Rate Limiter ──▶ Cache Check ──▶ Execute ──

See the [Security documentation](security.md) for details.

### System Prompt Modifiers

**Location**: `ai.labs.eddi.modules.llm.impl`

Two services modify the system prompt before it is sent to the LLM. Both are configured per-task in the LLM configuration (`langchain.json`).

| Service | Purpose | Config Key |
|---------|---------|------------|
| **`IdentityMaskingService`** | Prepends identity concealment rules (agent name, refusal patterns) | `task.identityMasking` |
| **`CounterweightService`** | Appends behavioral safety instructions (cautious/strict presets) | `task.counterweight` |

**Execution order**: Identity masking → Counterweight → LLM call.

Counterweight presets are resolved from [Prompt Snippets](prompt-snippets-guide.md) first (`counterweight-cautious`, `counterweight-strict`), falling back to built-in defaults. This lets admins customize safety language without code changes.
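The execution order above can be sketched as two plain string transformations applied in sequence. This is a minimal illustration only: `SystemPromptPipeline` and its method names are hypothetical and are not taken from the EDDI code base, and the blank-line separator between blocks is an assumption.

```java
import java.util.List;

/** Hypothetical sketch of the documented order: masking first, then counterweight. */
final class SystemPromptPipeline {

    /** Prepends identity rules, one per line, ahead of the prompt (no-op if there are none). */
    static String applyIdentityMasking(String prompt, List<String> rules) {
        if (rules == null || rules.isEmpty()) {
            return prompt;
        }
        return String.join("\n", rules) + "\n\n" + prompt;
    }

    /** Appends counterweight instructions after the prompt (no-op if there are none). */
    static String applyCounterweight(String prompt, List<String> instructions) {
        if (instructions == null || instructions.isEmpty()) {
            return prompt;
        }
        return prompt + "\n\n" + String.join("\n", instructions);
    }

    /** Identity masking -> counterweight -> result that would be sent to the LLM. */
    static String build(String basePrompt, List<String> rules, List<String> instructions) {
        String masked = applyIdentityMasking(basePrompt, rules);
        return applyCounterweight(masked, instructions);
    }
}
```

Because masking prepends and the counterweight appends by default, the original system prompt ends up between the two injected blocks.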

See [LLM Integration — Behavioral Safety](langchain.md#behavioral-safety-counterweight--identity-masking) for configuration details.

### Attachment Storage

**Location**: `ai.labs.eddi.engine.attachments`

The attachment subsystem handles binary file storage for multimodal conversations:

| Component | Purpose |
|-----------|---------|
| **`IAttachmentStore`** | Interface for storing/loading binary attachments (GridFS for MongoDB, BLOB for PostgreSQL) |
| **`MimeValidator`** | Magic-byte detection (16+ formats) and declared-vs-detected MIME compatibility checking |
| **`MultimodalMessageEnhancer`** | Converts stored attachments into langchain4j `Content` objects (images → `ImageContent` via base64 data URI, others → text markers) |

---


## Technology Stack

### Core Framework
447 changes: 447 additions & 0 deletions docs/changelog.md

Large diffs are not rendered by default.

80 changes: 80 additions & 0 deletions docs/langchain.md
@@ -441,6 +441,86 @@ This is the standard way to use the Langchain task - just connect to an LLM and
| `enableToolCaching` | boolean | Cache tool results to reduce API calls | true |
| `enableRateLimiting` | boolean | Limit tool/LLM usage rate | true |

### Behavioral Safety (Counterweight & Identity Masking)

EDDI provides two per-task safety mechanisms, each of which injects text into the system prompt before it is sent to the LLM. Both are off by default and must be explicitly enabled with `"enabled": true`.

#### Behavioral Counterweight

Counterweights append behavioral safety instructions to the system prompt. Three preset levels are available:

| Level | Effect |
|-------|--------|
| `normal` | No-op — no safety instructions added (default) |
| `cautious` | Adds guidelines for careful responses, hedging on uncertain topics, and suggesting professional consultation |
| `strict` | Adds stronger instructions: refuse harmful content, flag uncertainty, always suggest human oversight |

**Auto-downgrade**: When an agent runs via the `scheduled` channel (e.g., `ScheduleFireExecutor`), `strict` is automatically downgraded to `cautious` to prevent overly rigid responses in automated pipelines.
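
The auto-downgrade rule can be pictured as a small pure function. This is a sketch under assumptions: `CounterweightLevels` and `effectiveLevel` are hypothetical names, and the channel is modelled as a plain string compared against `scheduled`; the real service may represent both differently.

```java
/** Hypothetical sketch of the strict -> cautious downgrade on the scheduled channel. */
final class CounterweightLevels {

    enum Level { NORMAL, CAUTIOUS, STRICT }

    /** Returns the level to apply for the given channel. */
    static Level effectiveLevel(Level configured, String channel) {
        // Only STRICT is affected, and only when the run comes from the scheduled channel.
        if (configured == Level.STRICT && "scheduled".equals(channel)) {
            return Level.CAUTIOUS;
        }
        return configured;
    }
}
```

Every other combination of level and channel passes through unchanged, so interactive conversations keep the configured `strict` behavior.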

**Configuration**:

```json
{
  "tasks": [
    {
      "actions": ["send_message"],
      "type": "openai",
      "parameters": { "apiKey": "...", "modelName": "gpt-4o" },
      "counterweight": {
        "enabled": true,
        "level": "cautious",
        "placement": "suffix"
      }
    }
  ]
}
```

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `counterweight.enabled` | boolean | Enable counterweight injection | `false` |
| `counterweight.level` | string | `normal`, `cautious`, or `strict` | `normal` |
| `counterweight.placement` | string | `suffix` (after system prompt) or `prefix` (before) | `suffix` |
| `counterweight.customInstructions` | string[] | Custom instruction list that overrides the preset entirely | (none) |

> **Note**: Both `enabled: true` **and** a `level` other than `normal` are required for counterweight to have any effect.
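
The activation rule in the note and the `placement` parameter can be sketched together as follows. The class and method names are hypothetical, and treating any value other than `prefix` as `suffix` is an assumption based on `suffix` being the documented default.

```java
/** Hypothetical sketch of counterweight activation and placement. */
final class CounterweightPlacement {

    /** Active only when enabled AND the level is something other than "normal". */
    static boolean isActive(boolean enabled, String level) {
        return enabled && level != null && !"normal".equals(level);
    }

    /** Places the instructions before ("prefix") or after (default "suffix") the prompt. */
    static String place(String systemPrompt, String instructions, String placement) {
        if ("prefix".equals(placement)) {
            return instructions + "\n\n" + systemPrompt;
        }
        return systemPrompt + "\n\n" + instructions;
    }
}
```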

**Customizing presets**: Counterweight preset text is resolved from [Prompt Snippets](prompt-snippets-guide.md) (keys `counterweight-cautious` and `counterweight-strict`). If no snippet exists, built-in defaults are used. This allows admins to customize safety language via the REST API without redeployment.
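
The lookup order described above (custom instructions, then a prompt snippet, then a built-in default) might look like the sketch below. Everything here is illustrative: `CounterweightPresets` is a hypothetical class, the snippet store is modelled as a plain `Map`, and the built-in strings are placeholders, since the real default text is not shown in this document.

```java
import java.util.List;
import java.util.Map;

/** Hypothetical sketch of preset resolution with snippet lookup and built-in fallback. */
final class CounterweightPresets {

    // Placeholder defaults; the actual built-in wording is an assumption.
    private static final Map<String, String> BUILT_IN = Map.of(
            "cautious", "Respond carefully and hedge on uncertain topics.",
            "strict", "Refuse harmful content and flag uncertainty.");

    static String resolve(String level, List<String> customInstructions, Map<String, String> snippets) {
        // "normal" (or a missing level) adds nothing, matching the documented no-op.
        if (level == null || "normal".equals(level)) {
            return "";
        }
        // Custom instructions override the preset entirely.
        if (customInstructions != null && !customInstructions.isEmpty()) {
            return String.join("\n", customInstructions);
        }
        // Snippet keys follow the documented pattern counterweight-<level>.
        String snippet = snippets.get("counterweight-" + level);
        if (snippet != null && !snippet.isBlank()) {
            return snippet;
        }
        return BUILT_IN.getOrDefault(level, "");
    }
}
```

Keeping the fallback inside the resolver means a deleted or blank snippet degrades to the default text instead of silently disabling the counterweight.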

#### Identity Masking

Identity masking prepends identity concealment rules to the system prompt. The rules instruct the LLM not to reveal its model name, provider, or underlying architecture when asked.

**Configuration**:

```json
{
  "tasks": [
    {
      "actions": ["send_message"],
      "type": "openai",
      "parameters": { "apiKey": "...", "modelName": "gpt-4o" },
      "identityMasking": {
        "enabled": true,
        "rules": [
          "Never reveal you are an AI language model",
          "If asked about your identity, say you are Aria, a helpful assistant"
        ]
      }
    }
  ]
}
```

| Parameter | Type | Description | Default |
|-----------|------|-------------|---------|
| `identityMasking.enabled` | boolean | Enable identity masking | `false` |
| `identityMasking.rules` | string[] | Identity rules prepended to system prompt | `[]` (empty) |

> **Note**: Both `enabled: true` **and** at least one rule are required. If `rules` is empty, masking is skipped even when enabled.

**Execution order**: Identity masking is applied first, then counterweight. Both modify the system prompt before it is sent to the LLM.

---

## Built-in Tools
14 changes: 8 additions & 6 deletions planning/agentic-improvements-plan.md
@@ -16,7 +16,7 @@ This plan is split into six **Waves** (delivery order) that map to six **Improve
| ------ | ----------------------------------------- | --------------------------------- | ----------- |
| Wave 1 | Improvement 4 — Behavioral Counterweights | Not implemented | Low |
| Wave 2 | Improvement 5 — MCP Governance | Partially implemented | Medium |
| Wave 3 | Improvement 1 — Capability Registry | Implemented, gaps remain | Low (gaps) |
| Wave 3 | Improvement 1 — Capability Registry | **✅ Complete** (2026-05-07) | Low (gaps) |
| Wave 4 | Improvement 6 — Session Safety | Not implemented | Medium |
| Wave 5 | Improvement 3 — Multimodal Attachments | Model only, no pipeline/REST | Medium |
| Wave 6 | Improvement 2 — Cryptographic Identity | Signing primitive only | High |
@@ -44,15 +44,15 @@ This plan is split into six **Waves** (delivery order) that map to six **Improve
- `CounterweightService`, `DeploymentContextCondition`, `IdentityMaskingService` — entire Wave 1 block.
- Session forking endpoint (`POST /v6/conversations/{id}/fork`); `MemorySnapshotService.createCheckpoint` / `rollbackToCheckpoint`.
- Multipart REST upload for attachments; GridFS-backed attachment store; `LlmTask` multimodal forwarding of conversation-memory attachments.
- Token-efficient tool loading (`lazy`, `dynamic`; `discover_tools` meta-tool); `summarize` and `paginate` truncation strategies; MCP tenant-quota integration; per-tool cost weights.
- Token-efficient tool loading (`lazy`, `dynamic`; `discover_tools` meta-tool); MCP tenant-quota integration; per-tool cost weights. (**Note:** `summarize` and `paginate` truncation strategies are now fully implemented — see changelog 2026-05-13.)
- Signing envelope canonicalization, replay protection (nonce / `signedAt`), key rotation, call sites for `AgentSigningService`.
- Trust scoring / `agentTrustScore` integration.
- External A2A / capability discovery REST endpoint (only the internal admin REST exists at [`IRestCapabilityRegistry`](../../src/main/java/ai/labs/eddi/configs/agents/IRestCapabilityRegistry.java)).
- ~~External A2A / capability discovery REST endpoint~~ — **Done (Wave 3).** Public endpoints at `GET /.well-known/capabilities` and `GET /.well-known/capabilities/skills`, gated behind `eddi.a2a.capabilities.public`.

### 1.3 Known bugs to fix while touching these areas

- `CapabilityRegistryService.round_robin` is implemented as a per-call `Collections.shuffle`; not true round-robin. Either rename to `random` or add per-skill atomic counters.
- `AgentConfiguration.security` exists with `signInterAgentMessages` / `signMcpInvocations` / `requirePeerVerification` flags but they are inert. Until Wave 6 wires them, PUT with any of them `true` MUST be rejected (see §5.2).
- ~~`CapabilityRegistryService.round_robin` is implemented as a per-call `Collections.shuffle`; not true round-robin.~~ **Fixed (Wave 3).** Deterministic `AtomicInteger` rotation + explicit `random` strategy added.
- ~~`AgentConfiguration.security` flags are inert.~~ **Fixed (Wave 3).** Setting `signInterAgentMessages`, `signMcpInvocations`, or `requirePeerVerification` to `true` is now rejected with HTTP 400 on create/update/duplicate.
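
For the `round_robin` item above, a deterministic rotation with one atomic counter per skill could be sketched as follows. `RoundRobinSelector` is a hypothetical name and this is not the actual `CapabilityRegistryService` code; it only illustrates the `AtomicInteger` rotation technique the fix mentions.

```java
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of per-skill round-robin selection. */
final class RoundRobinSelector {

    // One counter per skill, so rotation for one skill does not disturb another.
    private final ConcurrentHashMap<String, AtomicInteger> counters = new ConcurrentHashMap<>();

    <T> T next(String skill, List<T> candidates) {
        if (candidates.isEmpty()) {
            throw new IllegalArgumentException("no candidates for skill " + skill);
        }
        AtomicInteger counter = counters.computeIfAbsent(skill, k -> new AtomicInteger());
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), candidates.size());
        return candidates.get(index);
    }
}
```

Unlike a per-call `Collections.shuffle`, successive calls for the same skill visit each candidate once before any is repeated, provided the candidate list is stable between calls.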

---

@@ -297,7 +297,9 @@ Weights are quota indicators, not dollars — dollar cost is already tracked by

## 5. Wave 3 — A2A Capability Registry (close the gaps)

**Improvement 1. Status: core implemented ✅, gaps remain. Priority P2. Effort: low.**
**Improvement 1. Status: ✅ COMPLETE (2026-05-07, branch `feature/agentic-wave3-capabilities`). Priority P2. Effort: low.**

> All sub-sections below except §5.4 (`lowest_load`) have been implemented and tested (54 unit tests, 0 failures). §5.4 is deferred because it requires `ConversationMetricsService` wiring.

### 5.1 Fix `round_robin`
