Feat/group task orchestration#572
Conversation
…orchestration Add a new TASK_FORCE discussion style that enables agent groups to collaboratively accomplish tasks instead of debating. The pipeline follows PLAN -> EXECUTE -> VERIFY -> SYNTHESIS phases. - SharedTaskList: task model with status machine and dependency tracking - TaskListParser: three-tier LLM output parsing (JSON/Markdown/fallback) - GroupConversationService: task phase orchestration with parallel execution - DiscussionStylePresets: TASK_FORCE expansion and PLAN/EXECUTE/VERIFY templates - SSE events: TaskPlanCreated, TaskVerified for real-time streaming - MCP/Slack: TASK_FORCE support in tools and event formatting - HITL foundation: AWAITING_APPROVAL state for future Phase 9b 49 new/updated unit tests, all passing.
…error handling CRITICAL: SharedTaskList synchronized, ConcurrentHashMap for memberConversationIds, dependsOn resolution, null agentId guard. HIGH: transcript snapshot for parallel EXECUTE, timeout semantics (maxTasksPerAgent), round-robin assignment for ALL role, dedicated verification JSON parser, IllegalStateException catch, handleTaskFailure error events. MEDIUM: Slack EXPANDED_STYLES, HashSet cycle detection, singleTaskFallback output, HITL placeholder Javadoc. Docs: 7 files updated (5→6 styles). Tests: +18 (89 total, 0 failures).
…ensive setter C1-final: addTask→updateTask for dependency resolution (prevented duplicate TaskItems with same ID silently breaking dependency ordering). M1-final: setMemberConversationIds defensively wraps in ConcurrentHashMap (MongoDB deserialization was replacing it with plain LinkedHashMap). Dead code: removed unused snapshotTranscript from executeTaskExecutionPhase. New: SharedTaskList.updateTask() public method for in-place replacement. Regression tests (+20): GroupConversationServiceTaskForceTest (18 tests covering resolveTaskAssignment, tryParseVerificationJson, handleTaskFailure, setMemberConversationIds), SharedTaskListTest (+3 updateTask tests), McpGroupToolsTest (+1 TASK_FORCE assertion). Total: 109 tests, 0 failures.
QuotaExceededException from ConversationService was being silently caught and treated as a per-agent skip/retry. Now it is detected at 4 levels: 1. executeAgentTurn → startConversation: immediate GroupDiscussionException 2. executeAgentTurn → say(): unwrap from ExecutionException, abort (no retry) 3. Task execution loop: quota error exits the agent's CompletableFuture 4. Parallel phase: quota propagates through CompletionException, cancels remaining futures Prevents burning N round-trips when quota is already exhausted. Tests: +3 regression tests (startConversation quota, say() quota, no-retry). Total: 112 tests, 0 failures.
Review finding: with SKIP policy (the default), quota errors in the task execution loop were collected in the errors list but only checked if policy was ABORT. Now quota errors are checked unconditionally before the policy-gated check. 112 tests, 0 failures.
…time - DynamicAgentConfig: config-driven guardrails for agent creation, recruitment, delegation with provider/model whitelists and caps - GroupConversation: dynamicMembers, createdAgentIds, retainedAgentIds - 4 LLM tools: CreateSubAgentTool, ConverseWithAgentTool, FindAgentsByCapabilityTool, TeardownAgentTool - AgentOrchestrator: whitelist-gated tool registration with addDynamicAgentTools helper - GroupConversationService: findMemberIncludingDynamic, cleanupEphemeralAgents with lifecycle policy enforcement - 22 new tests in DynamicAgentToolsTest - Updated 11 existing test files for new constructor signatures - Fixed enum count assertions in AgentGroupConfigurationTest
…ts, retain wiring, quota, thread safety - C1: Share createdAgentIds/retainedAgentIds across all dynamic tools in AgentOrchestrator - C2: Wire retain flag in CreateSubAgentTool to retainedAgentIds set - C3: Remove double quota counting (acquireConversationSlot removed from tool) - C4: Make GroupConversation.transcript thread-safe (Collections.synchronizedList) + null-safe setter - C5: Fix dead ERROR detection — extractResponse returns null not empty string - C6: Add 8 ConverseWithAgentTool tests (was zero coverage) - M1: LifecyclePolicy String→enum with JsonValue/JsonCreator - M2: Synchronized block for dynamicMembers streaming in findMemberIncludingDynamic - M3: Call detectCycles() after task list build — fail-fast on circular deps - M5: Add unretainAgent() tool method - M6: Remove agent from createdAgentIds after teardown - M10: Case-insensitive provider/model guardrail checks 9,486 tests pass, 0 failures
…rdering, cycle detection scope - F1: Delete dead addDynamicAgentTools method (37 lines, replaced by inline block) - F2: Move createdAgentIds.remove() before delete branch in TeardownAgentTool (was skipped on delete=true early return, leaving counter stale) - F3: Move cycle detection after both pre-configured and LLM planning paths (was only in pre-configured block — LLM-generated deps were unchecked) - F4: Remove dead tenantQuotaService from CreateSubAgentTool (field, param, imports) + all caller/test updates — quota handled by startConversation() 9,486 tests pass, 0 failures
Dependency Review✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.Scanned FilesNone |
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (4)
🚧 Files skipped from review as they are similar to previous changes (4)
📝 WalkthroughWalkthroughThe PR adds ChangesTask-force runtime and dynamic agents
Estimated code review effort🎯 5 (Critical) | ⏱️ ~90+ minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
| * group members for assignment resolution | ||
| * @return non-empty list of parsed tasks | ||
| */ | ||
| public static List<ParsedTask> parse(String llmOutput, List<GroupMember> members) { |
| * VERIFY phase: The moderator reviews all completed tasks and provides | ||
| * pass/fail assessments. Results are parsed and applied to the task list. | ||
| */ | ||
| private void executeTaskVerificationPhase(GroupConversation gc, AgentGroupConfiguration config, List<GroupMember> speakers, |
| private void executeTaskVerificationPhase(GroupConversation gc, AgentGroupConfiguration config, List<GroupMember> speakers, | ||
| DiscussionPhase phase, ProtocolConfig protocol, String question, int phaseIdx, | ||
| GroupDiscussionEventListener listener, java.util.concurrent.atomic.AtomicInteger turnCounter, | ||
| int maxTurns) |
| * agent's pipeline. Tasks for different agents execute in parallel; tasks for | ||
| * the same agent execute sequentially within a single CompletableFuture. | ||
| */ | ||
| private void executeTaskExecutionPhase(GroupConversation gc, AgentGroupConfiguration config, List<GroupMember> speakers, |
| */ | ||
| private void executeTaskPlanPhase(GroupConversation gc, AgentGroupConfiguration config, List<GroupMember> speakers, | ||
| DiscussionPhase phase, ProtocolConfig protocol, String question, int phaseIdx, | ||
| GroupDiscussionEventListener listener, java.util.concurrent.atomic.AtomicInteger turnCounter, int maxTurns) |
| : Collections.synchronizedList(new ArrayList<>()); | ||
| } | ||
|
|
||
| public Set<String> getRetainedAgentIds() { |
| dynamicMembers.add(member); | ||
| } | ||
|
|
||
| public List<String> getCreatedAgentIds() { |
There was a problem hiding this comment.
Pull request overview
Adds a new TASK_FORCE group discussion style (PLAN → EXECUTE → VERIFY → SYNTHESIS), along with supporting infrastructure for task planning/parsing, parallel execution, verification events, and a new set of dynamic-agent LLM tools (create/converse/find/teardown). It also hardens tenant quota enforcement so quota exhaustion aborts group runs deterministically instead of being swallowed by skip/retry policies.
Changes:
- Introduces TASK_FORCE orchestration primitives:
SharedTaskList,TaskListParser, new phase templates, and task-plan/verification SSE events. - Adds dynamic-agent LLM tools and wires them into
LlmTask/AgentOrchestrator, plus Slack + MCP + REST/SSE surfacing. - Expands/updates test coverage for the new orchestration and tool behavior, and updates docs to reflect 6 built-in styles.
Reviewed changes
Copilot reviewed 44 out of 44 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/main/java/ai/labs/eddi/engine/api/IGroupConversationService.java | Adds callbacks for task plan + task verification events. |
| src/main/java/ai/labs/eddi/engine/internal/GroupConversationService.java | Implements TASK_FORCE phase routing, task execution/verification, quota propagation, and ephemeral-agent cleanup. |
| src/main/java/ai/labs/eddi/engine/internal/RestGroupConversation.java | Streams new TASK_FORCE events over SSE. |
| src/main/java/ai/labs/eddi/engine/internal/TaskListParser.java | Parses LLM task plans (JSON/markdown/fallback). |
| src/main/java/ai/labs/eddi/engine/lifecycle/GroupConversationEventSink.java | Defines event names + payload records for task plan + verification. |
| src/main/java/ai/labs/eddi/engine/mcp/McpGroupTools.java | Documents TASK_FORCE and adds optional pre-configured tasks to create_group. |
| src/main/java/ai/labs/eddi/configs/groups/model/AgentGroupConfiguration.java | Adds TASK_FORCE enums, task definitions, and dynamic-agent config/lifecycle policy. |
| src/main/java/ai/labs/eddi/configs/groups/model/DiscussionStylePresets.java | Adds TASK_FORCE preset phases and default PLAN/EXECUTE/VERIFY templates. |
| src/main/java/ai/labs/eddi/configs/groups/model/GroupConversation.java | Adds task list + dynamic member/created/retained tracking fields; expands enums. |
| src/main/java/ai/labs/eddi/configs/groups/model/SharedTaskList.java | Implements task state machine, dependency queries, and cycle detection. |
| src/main/java/ai/labs/eddi/configs/groups/rest/RestAgentGroupStore.java | Adds TASK_FORCE style description. |
| src/main/java/ai/labs/eddi/integrations/slack/SlackGroupDiscussionListener.java | Adds Slack rendering for task plan + verification and includes TASK_FORCE in expanded styles. |
| src/main/java/ai/labs/eddi/modules/llm/impl/AgentOrchestrator.java | Wires dynamic-agent tools into whitelist-gated built-in tool collection. |
| src/main/java/ai/labs/eddi/modules/llm/impl/LlmTask.java | Adds new CDI dependencies and passes them into AgentOrchestrator. |
| src/main/java/ai/labs/eddi/modules/llm/tools/CreateSubAgentTool.java | New tool to create/deploy sub-agents with provider/model guardrails and optional bootstrap message. |
| src/main/java/ai/labs/eddi/modules/llm/tools/ConverseWithAgentTool.java | New tool to message another agent (single- or multi-turn). |
| src/main/java/ai/labs/eddi/modules/llm/tools/FindAgentsByCapabilityTool.java | New tool to discover agents by skill via CapabilityRegistryService. |
| src/main/java/ai/labs/eddi/modules/llm/tools/TeardownAgentTool.java | New tool to undeploy/delete created agents and manage retention flags. |
| src/test/java/ai/labs/eddi/configs/groups/model/AgentGroupConfigurationTest.java | Updates enum assertions for new styles/phases/scopes. |
| src/test/java/ai/labs/eddi/configs/groups/model/DiscussionStylePresetsTest.java | Adds TASK_FORCE preset validation tests. |
| src/test/java/ai/labs/eddi/configs/groups/model/GroupConversationTest.java | Updates enum count assertions for new transcript/state enums. |
| src/test/java/ai/labs/eddi/configs/groups/model/SharedTaskListTest.java | New tests for task lifecycle, deps, cycles, and concurrency. |
| src/test/java/ai/labs/eddi/engine/internal/GroupConversationServiceTaskForceTest.java | New regression tests for TASK_FORCE fixes and quota propagation. |
| src/test/java/ai/labs/eddi/engine/internal/TaskListParserTest.java | New tests for plan parsing + agent resolution + fallbacks. |
| src/test/java/ai/labs/eddi/engine/mcp/McpGroupToolsTest.java | Updates tests for TASK_FORCE and new create_group signature. |
| src/test/java/ai/labs/eddi/modules/llm/impl/AgentOrchestratorTest.java | Updates constructor usage for new dependencies. |
| src/test/java/ai/labs/eddi/modules/llm/impl/AgentOrchestratorExtendedTest.java | Updates constructor usage for new dependencies. |
| src/test/java/ai/labs/eddi/modules/llm/impl/AgentOrchestratorExtendedBranchTest.java | Updates constructor usage for new dependencies. |
| src/test/java/ai/labs/eddi/modules/llm/impl/AgentOrchestratorBranchTest.java | Updates constructor usage for new dependencies. |
| src/test/java/ai/labs/eddi/modules/llm/impl/LlmTaskTest.java | Updates constructor usage for new dependencies. |
| src/test/java/ai/labs/eddi/modules/llm/impl/LlmTaskExtendedTest.java | Updates constructor usage for new dependencies. |
| src/test/java/ai/labs/eddi/modules/llm/impl/LlmTaskExtendedBranchTest.java | Updates constructor usage for new dependencies. |
| src/test/java/ai/labs/eddi/modules/llm/impl/LlmTaskDeepBranchTest.java | Updates constructor usage for new dependencies. |
| src/test/java/ai/labs/eddi/modules/llm/impl/LlmTaskConfigureTest.java | Updates constructor usage for new dependencies. |
| src/test/java/ai/labs/eddi/modules/llm/impl/LlmTaskBranchTest.java | Updates constructor usage for new dependencies. |
| src/test/java/ai/labs/eddi/modules/llm/tools/DynamicAgentToolsTest.java | New tests covering dynamic-agent tools and config defaults/fields. |
| README.md | Updates “5 styles” → “6 styles” marketing copy. |
| HANDOFF.md | Updates architecture handoff notes for 6 styles. |
| AGENTS.md | Updates roadmap line for group conversations (6 styles). |
| docs/architecture.md | Documents TASK_FORCE style at a high level. |
| docs/group-conversations.md | Documents TASK_FORCE style + new phase types/scopes and Slack table updates. |
| docs/slack-integration.md | Adds TASK_FORCE to Slack style table. |
| docs/mcp-server.md | Updates MCP docs from 5 → 6 styles. |
| docs/changelog.md | Adds detailed changelog entries for TASK_FORCE + dynamic agents + quota fixes. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Actionable comments posted: 19
🧹 Nitpick comments (3)
src/test/java/ai/labs/eddi/modules/llm/impl/LlmTaskBranchTest.java (1)
106-120: 📐 Maintainability & Code Quality | 🔵 Trivial | 🏗️ Heavy liftCentralize
LlmTasktest construction.This constructor signature is now duplicated across multiple test classes with a long tail of positional
nulls, so every dependency change forces broad mechanical edits. A small test fixture/builder would make these wiring tests much less brittle.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/test/java/ai/labs/eddi/modules/llm/impl/LlmTaskBranchTest.java` around lines 106 - 120, Centralize the repeated LlmTask test setup instead of constructing it inline in LlmTaskBranchTest with the long positional argument list and trailing nulls. Extract the shared wiring into a reusable test fixture or builder for LlmTask, then have this test use that helper so dependency changes only need one update. Use the LlmTask constructor setup in this test as the reference point and keep the helper responsible for supplying the common mocks and defaults.src/test/java/ai/labs/eddi/modules/llm/impl/LlmTaskTest.java (1)
130-146: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winUse explicit mocks for the new
LlmTaskdependencies.The five trailing
nulls remove most of the value of this constructor-wiring test: a reorder of the dynamic-agent parameters still compiles. Passing typed mocks here keeps the test useful as a contract check.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/test/java/ai/labs/eddi/modules/llm/impl/LlmTaskTest.java` around lines 130 - 146, The LlmTask constructor test is wiring the new dynamic-agent dependencies with trailing nulls, which weakens the contract check and hides parameter reordering mistakes. Update LlmTaskTest to pass explicit typed mocks for the five new constructor arguments instead of null, and initialize those mocks alongside the existing dependencies so the instantiation of LlmTask remains compile-time safe and order-sensitive.src/test/java/ai/labs/eddi/modules/llm/impl/AgentOrchestratorBranchTest.java (1)
95-104: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winReplace the new trailing
nulls with typed mocks.These placeholders let constructor-parameter drift keep compiling, so this suite no longer catches miswired dynamic-agent dependencies. Use dedicated mocks for the five new services instead.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@src/test/java/ai/labs/eddi/modules/llm/impl/AgentOrchestratorBranchTest.java` around lines 95 - 104, The AgentOrchestrator test setup is using trailing null placeholders for the five newly added constructor dependencies, which hides wiring regressions. Update the AgentOrchestrator instantiation in AgentOrchestratorBranchTest to pass typed mocks for those new services instead of null, and initialize them with the rest of the test’s mock fields so constructor-parameter drift is still caught.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/group-conversations.md`:
- Line 241: The markdown table row for the group conversation styles is
malformed because the TASK_FORCE entry is appended to the DELPHI row, causing an
invalid column count. Split the combined row into separate rows in the table
block around the DELPHI and TASK_FORCE entries so each strategy has its own
properly delimited row and renders correctly.
In
`@src/main/java/ai/labs/eddi/configs/groups/model/AgentGroupConfiguration.java`:
- Around line 424-428: The inheritParentModel flag is not being enforced at
runtime, so CreateSubAgentTool.createSubAgent() still forwards nullable
provider/model into SetupAgentRequest even when inheritance is disabled. Update
CreateSubAgentTool.createSubAgent() to check
AgentGroupConfiguration.isInheritParentModel() before propagating parent
model/provider values, and only apply the parent defaults when the flag is true;
otherwise leave them unset so SetupAgentRequest reflects the config correctly.
- Around line 430-435: The setter for lifecyclePolicy in AgentGroupConfiguration
currently allows null to overwrite the documented EPHEMERAL default, which can
change cleanup behavior in GroupConversationService.cleanupEphemeralAgents().
Update setLifecyclePolicy so a null input does not replace the existing default
value, and keep the default EPHEMERAL semantics intact when callers omit the
field or send explicit JSON null.
- Around line 261-263: Reject duplicate task subjects in the preconfigured task
list before storing it in AgentGroupConfiguration.setTasks. Add validation in
setTasks to ensure each TaskDefinition.subject is unique and fail fast on
duplicates so GroupConversationService.executeTaskPlanPhase cannot resolve
dependsOn against the wrong task. Use the TaskDefinition.subject accessor and
the setTasks method as the key locations for the fix.
In `@src/main/java/ai/labs/eddi/configs/groups/model/DiscussionStylePresets.java`:
- Around line 337-342: The TASK_FORCE preset’s EXECUTE phase is configured with
TASK_ONLY, so dependent tasks wait for prerequisites but never receive their
outputs in buildTaskExecutionInput(). Update the TaskForce preset in
DiscussionStylePresets.taskForce() so the Task Planning/Execution/Verification
flow uses TASK_WITH_DEPS for the execution phase, or otherwise align the preset
with the dependencyResults injection behavior. Make sure the DiscussionPhase
constructor call for the EXECUTE step matches the expected scope used by
buildTaskExecutionInput().
In `@src/main/java/ai/labs/eddi/configs/groups/model/SharedTaskList.java`:
- Around line 364-370: The SharedTaskList task accessors are exposing mutable
internal state by returning and storing the caller’s list directly. Update
getTasks() and setTasks() so they defensively copy the list at the boundary in
SharedTaskList, keeping the internal tasks field isolated from external mutation
while preserving the existing null-to-empty behavior.
- Around line 311-323: The `SharedTaskList.updateTask()` method currently
replaces the whole `TaskItem`, which allows callers to mutate execution state
fields instead of metadata only. Change `updateTask()` so it preserves the
existing task’s lifecycle data (`status`, assignee/assignment, results,
timestamps, and similar state) and only copies the metadata fields intended by
the Javadoc, reusing the existing task found by id. Keep the lookup/replace
logic in `SharedTaskList` but restrict updates to the safe metadata subset so
state changes still go through `assignTask()`, `startTask()`, `completeTask()`,
`verifyTask()`, and `failTask()`.
In `@src/main/java/ai/labs/eddi/engine/internal/GroupConversationService.java`:
- Around line 619-628: The fallback assignment in GroupConversationService is
allowing unassigned planned tasks to go to the moderator because it uses
TaskListParser.roundRobinAssign directly instead of the same moderator-excluding
logic used by resolveTaskAssignment. Update the assignment path around
resolveTaskAssignment, TaskListParser.roundRobinAssign, and the task assignment
block so round-robin selection excludes the moderator agent when routing
execution work, and only uses the moderator for planning/synthesis roles.
- Around line 888-910: The fallback in parseAndApplyVerification currently
auto-verifies every COMPLETED task when verification parsing fails, which turns
unreadable verifier output into false success. Update parseAndApplyVerification
(and, if needed, tryParseVerificationJson) so malformed JSON/plain text leaves
tasks unverified or fails the verification phase instead of calling
verifyTask(true) for all tasks. Keep the success path for valid parsed results,
but change the failure path to preserve a non-verified state and emit an
appropriate failure/diagnostic event through the listener if needed.
- Around line 667-670: The EXECUTE scheduling in GroupConversationService
currently uses a one-time snapshot from findExecutableTasks(), so newly
unblocked dependent tasks are never picked up after earlier tasks finish. Update
the EXECUTE flow around tasksByAgent and the worker loop so it repeatedly
re-queries for executable tasks and schedules newly available ones until no more
progress can be made, ensuring dependency chains like task A -> task B continue
within the same phase.
- Around line 724-729: The quota-exit path in GroupConversationService does not
actually abort the other worker futures, so sibling agents keep running after a
QuotaExceededException is caught. Update the handling around the worker loop and
the CompletableFuture.allOf(...).get(...) flow so that a quota failure triggers
immediate cancellation/termination of the remaining tasks, using the existing
GroupDiscussionException and QuotaExceededException checks to locate the abort
logic. Make sure the shared errors list and any future orchestration in the
related block also stop scheduling or continuing work once the first quota error
is observed.
In `@src/main/java/ai/labs/eddi/engine/mcp/McpGroupTools.java`:
- Around line 171-178: Reject pre-configured tasks in create_group() unless
style is TASK_FORCE, since config.getTasks() is only consumed by the TASK_FORCE
workflow and other styles silently ignore it. Update McpGroupTools.createGroup()
to validate the style argument before storing tasks, and return a clear error
when tasks are provided with any non-TASK_FORCE style; apply the same guard
anywhere else in the method where tasks are accepted or passed through.
In `@src/main/java/ai/labs/eddi/modules/llm/impl/AgentOrchestrator.java`:
- Around line 562-588: The dynamic-agent tracking state is being recreated
inside collectAllBuiltInTools(), so CreateSubAgentTool and TeardownAgentTool
lose discussion history on every executeIfToolsEnabled() call. Move
sharedCreatedIds and sharedRetainedIds to a discussion-scoped store such as
IConversationMemory (or equivalent) and have collectAllBuiltInTools() read the
existing state instead of instantiating new collections. Keep the tool classes
stateless, and preserve the same shared state across CreateSubAgentTool,
TeardownAgentTool, and any related dynamic-agent setup.
In `@src/main/java/ai/labs/eddi/modules/llm/tools/CreateSubAgentTool.java`:
- Around line 127-144: CreateSubAgentTool is building a SetupAgentRequest that
always passes a null apiKey, which causes explicit cloud providers to fail in
AgentSetupService.setupAgent before vault-backed credentials can be used. Update
the request construction in CreateSubAgentTool so non-local providers either
receive the needed credentials or are rejected earlier with a clear validation
path, and verify the flow around setupAgent and SetupAgentRequest handles
Anthropic/OpenAI/Gemini consistently.
- Around line 147-149: The retain flag in CreateSubAgentTool is overriding the
configured lifecycle behavior, so the retain path must be constrained by
DynamicAgentConfig.getLifecyclePolicy() instead of blindly adding the agent in
retainedAgentIds. Update the retain handling in CreateSubAgentTool so it only
marks an agent as retained when the active lifecycle policy permits it, and keep
the decision driven by the config rather than the LLM input; apply the same fix
in the other retain path around the later retain check as well.
- Around line 112-123: The model allow-list check in CreateSubAgentTool is
bypassed when provider is missing because the current fallback to "default" can
skip validation against DynamicAgentConfig.allowedModels. Update the guardrail
logic in CreateSubAgentTool so that when model is provided but provider is
omitted, it still resolves the correct provider key or otherwise validates
against the configured allowed models instead of assuming "default"; make the
check fail closed if no matching provider list can be found. Keep the comparison
and rejection behavior in the existing allowed-models branch, but ensure the
lookup uses a real provider identifier rather than silently bypassing
validation.
In `@src/main/java/ai/labs/eddi/modules/llm/tools/TeardownAgentTool.java`:
- Around line 63-87: The retain/teardown checks in
TeardownAgentTool.teardownAgent are racing with concurrent retainAgent or a
second teardown because the state is only checked before undeployAgent runs. Add
an atomic per-agent lifecycle reservation or shared synchronization around the
contains/removal transition so an agent can be marked “tearing down” before
undeployAgent is called, and make retainAgent honor that state to prevent
retain/teardown from interleaving.
- Around line 85-95: The teardown flow is removing the agent from tracking too
early in TeardownAgentTool.decommission logic, which breaks retries after a
failed permanent delete. Move the createdAgentIds.remove(agentId) call so it
only happens after agentStore.deleteAllPermanently(agentId) succeeds, and keep
the agent tracked if the delete throws; use the TeardownAgentTool method
handling the delete path and the createdAgentIds set as the key symbols to
update.
In `@src/test/java/ai/labs/eddi/engine/internal/TaskListParserTest.java`:
- Around line 213-221: The test name and assertions in jsonMissingSubject do not
match the actual behavior of TaskListParser.parse. Update the test to verify the
fallback single-task parsing result by asserting the returned TaskList has the
expected subject and description, using the parse method and the
jsonMissingSubject test case as the reference points; alternatively, rename the
test if it is only meant to check that the result is non-empty.
---
Nitpick comments:
In
`@src/test/java/ai/labs/eddi/modules/llm/impl/AgentOrchestratorBranchTest.java`:
- Around line 95-104: The AgentOrchestrator test setup is using trailing null
placeholders for the five newly added constructor dependencies, which hides
wiring regressions. Update the AgentOrchestrator instantiation in
AgentOrchestratorBranchTest to pass typed mocks for those new services instead
of null, and initialize them with the rest of the test’s mock fields so
constructor-parameter drift is still caught.
In `@src/test/java/ai/labs/eddi/modules/llm/impl/LlmTaskBranchTest.java`:
- Around line 106-120: Centralize the repeated LlmTask test setup instead of
constructing it inline in LlmTaskBranchTest with the long positional argument
list and trailing nulls. Extract the shared wiring into a reusable test fixture
or builder for LlmTask, then have this test use that helper so dependency
changes only need one update. Use the LlmTask constructor setup in this test as
the reference point and keep the helper responsible for supplying the common
mocks and defaults.
In `@src/test/java/ai/labs/eddi/modules/llm/impl/LlmTaskTest.java`:
- Around line 130-146: The LlmTask constructor test is wiring the new
dynamic-agent dependencies with trailing nulls, which weakens the contract check
and hides parameter reordering mistakes. Update LlmTaskTest to pass explicit
typed mocks for the five new constructor arguments instead of null, and
initialize those mocks alongside the existing dependencies so the instantiation
of LlmTask remains compile-time safe and order-sensitive.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: e97a5fe9-6129-4099-8c40-5b3e1040106e
📒 Files selected for processing (44)
AGENTS.mdHANDOFF.mdREADME.mddocs/architecture.mddocs/changelog.mddocs/group-conversations.mddocs/mcp-server.mddocs/slack-integration.mdsrc/main/java/ai/labs/eddi/configs/groups/model/AgentGroupConfiguration.javasrc/main/java/ai/labs/eddi/configs/groups/model/DiscussionStylePresets.javasrc/main/java/ai/labs/eddi/configs/groups/model/GroupConversation.javasrc/main/java/ai/labs/eddi/configs/groups/model/SharedTaskList.javasrc/main/java/ai/labs/eddi/configs/groups/rest/RestAgentGroupStore.javasrc/main/java/ai/labs/eddi/engine/api/IGroupConversationService.javasrc/main/java/ai/labs/eddi/engine/internal/GroupConversationService.javasrc/main/java/ai/labs/eddi/engine/internal/RestGroupConversation.javasrc/main/java/ai/labs/eddi/engine/internal/TaskListParser.javasrc/main/java/ai/labs/eddi/engine/lifecycle/GroupConversationEventSink.javasrc/main/java/ai/labs/eddi/engine/mcp/McpGroupTools.javasrc/main/java/ai/labs/eddi/integrations/slack/SlackGroupDiscussionListener.javasrc/main/java/ai/labs/eddi/modules/llm/impl/AgentOrchestrator.javasrc/main/java/ai/labs/eddi/modules/llm/impl/LlmTask.javasrc/main/java/ai/labs/eddi/modules/llm/tools/ConverseWithAgentTool.javasrc/main/java/ai/labs/eddi/modules/llm/tools/CreateSubAgentTool.javasrc/main/java/ai/labs/eddi/modules/llm/tools/FindAgentsByCapabilityTool.javasrc/main/java/ai/labs/eddi/modules/llm/tools/TeardownAgentTool.javasrc/test/java/ai/labs/eddi/configs/groups/model/AgentGroupConfigurationTest.javasrc/test/java/ai/labs/eddi/configs/groups/model/DiscussionStylePresetsTest.javasrc/test/java/ai/labs/eddi/configs/groups/model/GroupConversationTest.javasrc/test/java/ai/labs/eddi/configs/groups/model/SharedTaskListTest.javasrc/test/java/ai/labs/eddi/engine/internal/GroupConversationServiceTaskForceTest.javasrc/test/java/ai/labs/eddi/engine/internal/TaskListParserTest.javasrc/test/java/ai/labs/eddi/engine/mcp/McpGroupToolsTest.javasrc/test/java/ai/labs/eddi/modules/llm/impl/AgentOrchestratorBranchTest.javasrc/test/java/ai/labs/eddi/modules/llm/impl/AgentOrchestratorExtendedBranchTest.javasrc/test/java/ai/labs/eddi/modules/llm/impl/AgentOrchestratorExtendedTest.javasrc/test/java/ai/labs/eddi/modules/llm/impl/AgentOrchestratorTest.javasrc/test/java/ai/labs/eddi/modules/llm/impl/LlmTaskBranchTest.javasrc/test/java/ai/labs/eddi/modules/llm/impl/LlmTaskConfigureTest.javasrc/test/java/ai/labs/eddi/modules/llm/impl/LlmTaskDeepBranchTest.javasrc/test/java/ai/labs/eddi/modules/llm/impl/LlmTaskExtendedBranchTest.javasrc/test/java/ai/labs/eddi/modules/llm/impl/LlmTaskExtendedTest.javasrc/test/java/ai/labs/eddi/modules/llm/impl/LlmTaskTest.javasrc/test/java/ai/labs/eddi/modules/llm/tools/DynamicAgentToolsTest.java
- CodeQL: sanitize user-controllable values in log statements (LogSanitizer) - Security: fix model allow-list bypass when provider is omitted (now checks against ALL provider model lists instead of nonexistent 'default' key) - Docs: fix broken markdown table row (literal \r\n merged DELPHI/TASK_FORCE) - Slack: fix inaccurate Javadoc (compact mode IS reachable for CUSTOM style) - Config: null-safe LifecyclePolicy setter (null → EPHEMERAL default) - Test: remove unread variable in SharedTaskListTest - Test: add model allow-list bypass regression test 9,487 tests pass, 0 failures
…orchestration Add 60+ targeted tests covering all uncovered branches: - DynamicAgentToolsTest: initialMessage flow (success, failure, map-format), extractResponse branches, blank params, empty allow-lists, retain=false, null snapshot, mixed output items, general exceptions (+25 tests) - TaskListParserTest: all key aliases (title/name/desc/details/instructions/ agent/assignee/assigned_to), null members/assignedTo, markdown formats (dash/plus/em-dash), long text truncation, null displayName (+22 tests) - SharedTaskListTest: findTasksForAgent(null), wrong status transitions, nonexistent ID exceptions, failTask from various states, setTasks(null), size/isEmpty, nonexistent dependency (+12 tests) - AgentGroupConfigurationTest: LifecyclePolicy toJson/fromJson (all values, null, uppercase, invalid), TaskDefinition constructors (null subject/desc/ dependsOn/assignToRole, convenience), DiscussionPhase requiresApproval (+12) Also includes CI review fixes from previous iteration: - GroupConversationService: wave-based task execution loop - CreateSubAgentTool: case-insensitive model allow-list - TeardownAgentTool: CopyOnWriteArrayList fallback
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/test/java/ai/labs/eddi/modules/llm/tools/DynamicAgentToolsTest.java`:
- Around line 976-982: The test constructor_nullFallbacks in
DynamicAgentToolsTest currently only proves null-safety because
teardownAgent("non-existent", false) returns before any fallback collection
mutation occurs, so immutable defaults would still pass. Either rename the test
to reflect null-safety only, or update it to exercise an observable mutation
path in TeardownAgentTool so the mutable-defaults claim is actually verified.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 9dbc933f-121f-4bf5-9343-2eaa0b8e1bdf
📒 Files selected for processing (8)
docs/changelog.mdsrc/main/java/ai/labs/eddi/engine/internal/GroupConversationService.javasrc/main/java/ai/labs/eddi/modules/llm/tools/CreateSubAgentTool.javasrc/main/java/ai/labs/eddi/modules/llm/tools/TeardownAgentTool.javasrc/test/java/ai/labs/eddi/configs/groups/model/AgentGroupConfigurationTest.javasrc/test/java/ai/labs/eddi/configs/groups/model/SharedTaskListTest.javasrc/test/java/ai/labs/eddi/engine/internal/TaskListParserTest.javasrc/test/java/ai/labs/eddi/modules/llm/tools/DynamicAgentToolsTest.java
✅ Files skipped from review due to trivial changes (1)
- docs/changelog.md
🚧 Files skipped from review as they are similar to previous changes (4)
- src/main/java/ai/labs/eddi/modules/llm/tools/TeardownAgentTool.java
- src/main/java/ai/labs/eddi/modules/llm/tools/CreateSubAgentTool.java
- src/test/java/ai/labs/eddi/configs/groups/model/SharedTaskListTest.java
- src/main/java/ai/labs/eddi/engine/internal/GroupConversationService.java
Address all 5 findings from Copilot and CodeRabbit PR reviews: HIGH — Dynamic agent tracking lists disconnected (Copilot #1 & #3): AgentOrchestrator created fresh sharedCreatedIds/sharedRetainedIds per tool-list build. These were never propagated to GroupConversation, so cleanupEphemeralAgents() could never see created agents. Fix: Store tracking lists as step data in IConversationMemory via well-known keys (dynamic:created_agent_ids, dynamic:retained_agent_ids). Lists are stored by reference — after tool execution they contain all accumulated IDs. GroupConversationService.propagateDynamicAgentTracking() reads them from the snapshot callback and merges into gc tracking lists. MEDIUM — NPE guard for conversationService (Copilot #2): CreateSubAgentTool was enabled when agentSetupService != null but conversationService could be null, causing NPE on initialMessage. Fix: Added conversationService != null guard. MEDIUM — Unused 'members' parameter in TaskListParser.parse() (Copilot #4): Javadoc claimed members was used for assignment resolution but the parameter was unused inside parse(). Fix: Updated Javadoc to clarify members is accepted for API consistency only. MINOR — Test doesn't verify mutable defaults (CodeRabbit): constructor_nullFallbacks test didn't exercise mutation path. Renamed to constructor_nullArgs_doesNotThrow and added unretainAgent assertion.
| * @throws IllegalStateException | ||
| * if the task is not in ASSIGNED status | ||
| */ | ||
| public synchronized TaskItem startTask(String taskId) { |
There was a problem hiding this comment.
I understand why there is a necessity to have those methods synchronized. Can we write tests to specifically rule out deadlocks? These methods are prone to lock each other out.
🎯 TASK_FORCE Discussion Style + Dynamic Agent System
Summary
Adds a new TASK_FORCE discussion style for group conversations that enables structured, multi-phase task orchestration — plus a dynamic agent system allowing LLM agents to create, recruit, converse with, and teardown other agents at runtime.
This is the largest feature addition to EDDI's group conversation system since its initial implementation, touching 44 files with +4,753/-62 lines across 11 commits.
🏗️ What's New
1. TASK_FORCE Discussion Style
A structured, multi-phase orchestration pattern for collaborative problem-solving:
Key capabilities:
subject,description,assignToRole,dependsOn, andprioritySharedTaskList— state machine for task lifecycle (PENDING→ASSIGNED→IN_PROGRESS→COMPLETED/FAILED/BLOCKED)TaskListParser— robust parser for LLM-generated task plans (numbered lists, markdown tables, JSON arrays)CompletableFuture2. Dynamic Agent System
LLM agents can now spin up specialist agents on the fly during group discussions:
CreateSubAgentToolConverseWithAgentToolFindAgentsByCapabilityToolCapabilityRegistryServiceTeardownAgentToolretainAgent/unretainAgentlifecycle controlGuardrails (all config-driven):
maxCreatedAgentsPerDiscussion)EPHEMERAL|KEEP_DEPLOYED|UNDEPLOY_ONLY|AGENT_DECIDESTenantQuotaService3. Tenant Quota Enforcement in Groups
Group conversations now respect tenant quota limits. When quota is exhausted:
onAgentFailurepolicySKIPpolicy properly skips the agent turn rather than crashing📐 Architecture Decisions
@Vetoed, constructed per-invocationCopyOnWriteArrayList/ConcurrentHashMap.newKeySet()LifecyclePolicyas enum with@JsonValue/@JsonCreatorCollections.synchronizedListfor transcript/dynamicMembersextractResponse()returnsnullnot""response == null && state == ERROR)🧪 Testing
9,486 tests pass, 0 failures, 0 errors
New test files (4):
SharedTaskListTestGroupConversationServiceTaskForceTestTaskListParserTestDynamicAgentToolsTestUpdated test files (11):
All existing
AgentOrchestrator,LlmTask, andMcpGroupToolstest files updated for new constructor signatures and CDI dependencies.📋 Commits (11)
97ecf81feat(groups): add TASK_FORCE discussion style40ebf1bfix(groups): code review fixes — thread safety, verification parser, error handling8b15c9dfix(groups): final review — duplicate task bug, regression tests32dc3e5fix(groups): enforce tenant quota abort in group conversations38ce028fix(groups): quota errors propagate regardless of onAgentFailure policy320e246docs(changelog): add SKIP policy review finding26e584afeat(groups): dynamic agent system — create, recruit, delegate at runtimeeaa3867fix(dynamic-agents): critical code review fixes (6 critical, 6 medium)d123784refactor(dynamic-agents): post-review cleanup — dead code, ordering, scope📊 Impact
None. All changes are additive:
TASK_FORCEstyle is opt-in via group configSEQUENTIAL,ROUND_ROBIN,DEBATE, etc.) are unaffectedLifecyclePolicyenum is backward-compatible with existing"ephemeral"JSON strings via@JsonCreatorType of Change
Checklist
./mvnw clean verify -DskipITs)Summary by CodeRabbit