
Add verbose per-step agent execution tracing (#5817) #5820

Open
devin-ai-integration[bot] wants to merge 2 commits into main from devin/1778855193-verbose-tracing


@devin-ai-integration
Contributor

@devin-ai-integration devin-ai-integration Bot commented May 15, 2026

Summary

Implements the feature requested in #5817: structured per-step verbose logging for crew execution tracing.

When verbose=True is set on a Crew, the following structured logs are now emitted:

[crew] Starting crew execution
[Agent: researcher] Starting task: research_task
[Agent: researcher] Task complete: research_task, passing to: writer
[Agent: writer] Received context from researcher
[Agent: writer] Starting task: write_article
[Agent: writer] Task complete: write_article
[crew] Crew execution completed

Changes in lib/crewai/src/crewai/crew.py:

  • _log_crew_start() / _log_crew_finish(): emit crew-level start/finish logs (uses crew.name, defaults to "crew")
  • Enhanced _log_task_start(): now also emits the structured [Agent: X] Starting task: Y verbose log (uses task.name if set, falls back to task.description)
  • _log_task_completion(): emits [Agent: X] Task complete: Y with optional passing to: Z when the next agent differs
  • _log_context_received(): emits [Agent: Y] Received context from X at handoff points
  • _get_next_agent_role(): helper to look ahead at the next task's agent
  • Updated _execute_tasks() to track previous_agent_role and call the new logging methods
  • Updated _process_async_tasks() to emit completion/handoff logs for async tasks too
  • Updated tuple types for futures to carry agent_role and next_agent_role

All new logs go through the existing Logger class and respect the verbose flag — nothing is printed when verbose=False.
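
The log sequence described above can be reproduced with a small standalone sketch. This is not the code in crew.py: `TraceLogger` and `trace_sequential` are hypothetical names used only for illustration, and the sketch assumes a strictly sequential run in which each step is an (agent role, task name) pair.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class TraceLogger:
    """Hypothetical stand-in for a verbose-aware logger (not crewAI's Logger)."""

    verbose: bool = False
    lines: List[str] = field(default_factory=list)

    def log(self, message: str) -> None:
        # Emit nothing at all unless verbose tracing is enabled.
        if self.verbose:
            self.lines.append(message)
            print(message)


def trace_sequential(
    logger: TraceLogger, crew_name: str, steps: List[Tuple[str, str]]
) -> None:
    """Log a sequential run; `steps` holds (agent_role, task_name) pairs."""
    logger.log(f"[{crew_name}] Starting crew execution")
    previous_role: Optional[str] = None
    for index, (role, task_name) in enumerate(steps):
        # A handoff is reported only when the agent role changes.
        if previous_role is not None and previous_role != role:
            logger.log(f"[Agent: {role}] Received context from {previous_role}")
        logger.log(f"[Agent: {role}] Starting task: {task_name}")
        # Look ahead one step to decide whether to append "passing to".
        next_role = steps[index + 1][0] if index + 1 < len(steps) else None
        message = f"[Agent: {role}] Task complete: {task_name}"
        if next_role is not None and next_role != role:
            message += f", passing to: {next_role}"
        logger.log(message)
        previous_role = role
    logger.log(f"[{crew_name}] Crew execution completed")


if __name__ == "__main__":
    trace_sequential(
        TraceLogger(verbose=True),
        "crew",
        [("researcher", "research_task"), ("writer", "write_article")],
    )
```

With verbose=False the same call records and prints nothing, which is the behavior the PR description promises for the real logger.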

Supersedes #5819 (closed due to repo restructuring).

Review & Testing Checklist for Human

  • Run Crew(verbose=True) with a multi-agent sequential workflow and verify structured logs appear in the expected format
  • Verify Crew(verbose=False) produces no new output
  • Confirm existing verbose output (Agent/Task/Thought/Tool/Final Answer blocks from CrewAgentExecutor) still appears alongside the new structured logs

Notes

  • Seven new unit tests in lib/crewai/tests/test_crew.py cover crew start/finish logs, task start/completion, handoff logs, no handoff for the same agent, custom crew name, and task name fallback
  • All tests use mocks (no LLM calls) for fast, reliable execution
  • The existing test_crew_verbose_output test continues to pass unchanged
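
A mock-only test of this kind might look roughly like the sketch below. It does not reproduce the tests in test_crew.py: `emit` is a hypothetical colored printer standing in for the real logger, and the ANSI pattern only covers simple color sequences.

```python
import io
import re
from contextlib import redirect_stdout

# Matches simple SGR color sequences such as "\x1b[1;36m" and "\x1b[0m".
ANSI_ESCAPE = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text: str) -> str:
    """Remove color codes so assertions compare plain text."""
    return ANSI_ESCAPE.sub("", text)


def emit(message: str, verbose: bool) -> None:
    """Hypothetical colored printer; prints only when verbose is on."""
    if verbose:
        print(f"\x1b[1;36m{message}\x1b[0m")


def capture(verbose: bool) -> str:
    """Run a tiny fake trace and return its output without ANSI codes."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        emit("[crew] Starting crew execution", verbose)
        emit("[Agent: researcher] Starting task: research_task", verbose)
    return strip_ansi(buffer.getvalue())


def test_verbose_emits_structured_lines() -> None:
    output = capture(verbose=True)
    assert "[crew] Starting crew execution" in output
    assert "[Agent: researcher] Starting task: research_task" in output


def test_quiet_emits_nothing() -> None:
    assert capture(verbose=False) == ""
```

Stripping color codes before asserting keeps the checks independent of how the logger styles its output.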

Link to Devin session: https://app.devin.ai/sessions/3893287f119348278ee6998438d33b0f

Summary by CodeRabbit

  • New Features

    • Enhanced verbose logging showing crew start and finish, per-task start/completion, and task handoffs when agents change.
    • Logs include crew and task names and indicate when context is passed between agents.
  • Tests

    • Added tests verifying verbose logging behavior, handoff/no-handoff cases, and name/display fallbacks.


- Add crew-level start/finish logs: [crew_name] Starting/Completed
- Add per-task start/completion logs: [Agent: X] Starting/Task complete
- Add handoff logs between agents: passing to / Received context from
- Support custom crew name in logs
- Use task.name when available, fallback to description
- Add completion/handoff logs for async tasks
- Add 7 unit tests covering all new verbose tracing behavior

Co-Authored-By: João <joao@crewai.com>
@devin-ai-integration
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.


@coderabbitai

coderabbitai Bot commented May 15, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: 071bb52f-f329-44e9-a194-30c3f3396959

📥 Commits

Reviewing files that changed from the base of the PR and between bb935d3 and 29fff82.

📒 Files selected for processing (1)
  • lib/crewai/src/crewai/crew.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • lib/crewai/src/crewai/crew.py

📝 Walkthrough


This PR adds verbose logging to Crew execution: crew start/finish messages, per-task start/completion logs, agent handoff detection via next-agent-role computation, expanded futures metadata for async tasks, and tests validating verbose output.

Changes

Verbose Logging for Crew Execution

| Layer / File(s) | Summary |
|---|---|
| Logging and role computation helpers / lib/crewai/src/crewai/crew.py | New private methods _log_crew_start(), _log_crew_finish(), _log_task_completion(), _log_context_received(), and _get_next_agent_role() provide centralized logging and next-agent-role computation for handoff messaging. |
| Crew lifecycle logging in kickoff() / lib/crewai/src/crewai/crew.py | Crew.kickoff() now calls _log_crew_start() at the start and _log_crew_finish() before returning results to emit crew lifecycle messages when verbose is enabled. |
| Role tracking and futures metadata / lib/crewai/src/crewai/crew.py | The futures tuple structure is expanded to include (agent_role, next_agent_role) metadata; _execute_tasks() computes next agent roles to support handoff logging. |
| Synchronous execution with role transitions / lib/crewai/src/crewai/crew.py | Sync execution tracks previous agent roles, logs when a task receives context from a different agent, logs task completion with optional "passing to" agent, and updates role state for subsequent tasks. |
| Async and conditional execution updates / lib/crewai/src/crewai/crew.py | _handle_conditional_task() and _process_async_tasks() signatures accept the expanded futures tuple and async completions are logged using role/next-role metadata. |
| Verbose logging test suite / lib/crewai/tests/test_crew.py | Seven new pytest functions validate crew start/finish logging, per-task logging, handoff logs when agents change, absence of handoff logs when same agent runs consecutively, and inclusion of crew/task names in log messages. |
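
The idea of carrying role metadata alongside each pending future can be illustrated with the standard library alone. The tuple layout below (task name, future, agent role, next agent role) is an assumption made for this sketch; the actual tuple in crew.py may hold different or additional fields, and `drain` is a hypothetical helper.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from typing import List, Optional, Tuple

# Assumed layout: (task_name, future, agent_role, next_agent_role).
PendingTask = Tuple[str, "Future[str]", str, Optional[str]]


def drain(pending: List[PendingTask]) -> List[str]:
    """Wait for each future in order and build its completion log line."""
    lines: List[str] = []
    for task_name, future, role, next_role in pending:
        future.result()  # block until this async task has finished
        line = f"[Agent: {role}] Task complete: {task_name}"
        # The role metadata travels with the future, so no lookup is needed here.
        if next_role is not None and next_role != role:
            line += f", passing to: {next_role}"
        lines.append(line)
    return lines


def run_example() -> List[str]:
    """Submit two trivial jobs and log their completion."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        pending: List[PendingTask] = [
            ("research_task", pool.submit(lambda: "notes"), "researcher", "writer"),
            ("write_article", pool.submit(lambda: "draft"), "writer", None),
        ]
        return drain(pending)
```

Storing the roles at submission time means the completion log does not have to re-derive them once the future resolves, which is presumably why the tuple was widened.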

Sequence Diagram(s)

sequenceDiagram
  participant Client as Crew.kickoff()
  participant Log as _log_* helpers
  participant Exec as _execute_tasks / _process_async_tasks
  participant Agent as Agent (by role)

  Client->>Log: _log_crew_start()
  Client->>Exec: execute tasks (verbose, role tracking)
  Exec->>Exec: _get_next_agent_role()
  Exec->>Agent: run task (agent_role)
  Agent-->>Exec: task completes (result)
  Exec->>Log: _log_task_completion(role, task, next_agent_role)
  Exec->>Log: _log_context_received(role, from_role) when role changed
  Exec-->>Client: task results
  Client->>Log: _log_crew_finish()

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes


Poem

🐰 I hopped through logs at start and end,
Each task I watched, each handoff penned,
When roles changed paws passed the note,
The verbose trail helped me stay afloat,
A happy rabbit logging every friend.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: adding verbose per-step agent execution tracing with structured logging for Crew execution. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Co-Authored-By: João <joao@crewai.com>

@coderabbitai coderabbitai Bot left a comment


Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
lib/crewai/src/crewai/crew.py (2)

1469-1528: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Don't infer handoffs from adjacency alone.

previous_agent_role / _get_next_agent_role() treat neighboring tasks as a real transfer, but async tasks only receive last_sync_output, and conditional tasks may be skipped entirely. That can produce passing to ... / Received context from ... logs for handoffs that never actually happened.
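
The difference between adjacency-based and context-based handoff detection can be seen in a simplified model. Everything here is hypothetical: `Step`, `adjacency_handoffs`, and `actual_handoffs` are illustration names, and the model assumes each step consumes only the most recent synchronous output, which is a simplification of how the real crew assembles context.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Step:
    role: str
    name: str
    is_async: bool = False  # async steps read only the last sync output
    skipped: bool = False   # e.g. a conditional task whose condition failed


def adjacency_handoffs(steps: List[Step]) -> List[Tuple[str, str]]:
    """Naive approach: any role change between neighbors counts as a handoff."""
    return [(a.role, b.role) for a, b in zip(steps, steps[1:]) if a.role != b.role]


def actual_handoffs(steps: List[Step]) -> List[Tuple[str, str]]:
    """Report a handoff only from the step whose output was really consumed."""
    handoffs: List[Tuple[str, str]] = []
    last_sync: Optional[Step] = None
    for step in steps:
        if step.skipped:
            continue  # a skipped step neither receives nor passes context
        if last_sync is not None and last_sync.role != step.role:
            handoffs.append((last_sync.role, step.role))
        if not step.is_async:
            last_sync = step  # only sync steps become the next context source
    return handoffs


STEPS = [
    Step("researcher", "research"),
    Step("analyst", "analyze", is_async=True),
    Step("editor", "check", skipped=True),
    Step("writer", "write"),
]
```

For `STEPS`, the adjacency version reports analyst to editor and editor to writer, two transfers that never occur in this model, while the context-based version reports only researcher to analyst and researcher to writer.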


955-973: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Mirror the new verbose tracing in akickoff().

This only wires the crew/task completion tracing through the sync path. Crew.akickoff() still goes through _aexecute_tasks() / _aprocess_async_tasks() without the new crew start/finish or completion/handoff logs, so verbose=True behaves differently on the native async API.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/src/crewai/crew.py` around lines 955 - 973, The async kickoff path
(akickoff) needs to mirror the synchronous kickoff: call self._log_crew_start()
at the start, await the async process/run (use _aexecute_tasks() /
_aprocess_async_tasks() or the async equivalents of
_run_sequential_process/_run_hierarchical_process), then invoke each
after_kickoff_callbacks (await any coroutine callbacks), run
self._post_kickoff(result), update self.usage_metrics =
self.calculate_usage_metrics(), and finally call self._log_crew_finish(); also
ensure any task completion/handoff tracing calls used in the sync flow are
invoked in the async flow so verbose=True produces identical trace output.
🧹 Nitpick comments (1)
lib/crewai/tests/test_crew.py (1)

4999-5239: ⚡ Quick win

Add regression coverage for async tracing paths.

These assertions only exercise sync kickoff() or call helpers directly, so they won't catch false handoffs between consecutive async tasks or the missing completion/handoff logs on akickoff().

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@lib/crewai/tests/test_crew.py` around lines 4999 - 5239, Add async-path tests
mirroring the existing verbose sync tests to cover Crew.akickoff(): for cases
asserting crew start/finish, per-task start/complete, handoff between agents, no
handoff for same agent, crew name/task name logging, patch Task.execute_async
with AsyncMock (or patch.object(Task, "execute_async", side_effect=[...])
returning TaskOutput) and run the async flow with pytest.mark.asyncio and await
crew.akickoff(), then capture and strip ANSI codes and assert the same log
strings (e.g., use Crew.akickoff, Task.execute_async,
Crew._log_crew_start/_log_crew_finish, Crew._log_task_start) so false handoffs
or missing async logs are detected.
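
An async regression test along these lines could be structured as in the sketch below. To stay runnable without crewAI or pytest-asyncio it drives a hypothetical `run_async_crew` coroutine with `asyncio.run`; in the real suite the awaited call would be crew.akickoff() with the task executor patched, as the prompt suggests.

```python
import asyncio
from typing import Awaitable, Callable, List, Tuple
from unittest.mock import AsyncMock


async def run_async_crew(
    execute: Callable[[str], Awaitable[str]],
    steps: List[Tuple[str, str]],
    log: Callable[[str], None],
) -> None:
    """Hypothetical async runner; `execute` stands in for a patched executor."""
    log("[crew] Starting crew execution")
    for role, task_name in steps:
        log(f"[Agent: {role}] Starting task: {task_name}")
        await execute(task_name)
        log(f"[Agent: {role}] Task complete: {task_name}")
    log("[crew] Crew execution completed")


def test_async_path_emits_trace() -> None:
    # AsyncMock returns one side_effect value per awaited call; no LLM is used.
    execute = AsyncMock(side_effect=["notes", "draft"])
    lines: List[str] = []
    steps = [("researcher", "research_task"), ("writer", "write_article")]
    asyncio.run(run_async_crew(execute, steps, lines.append))

    assert execute.await_count == 2
    assert lines[0] == "[crew] Starting crew execution"
    assert "[Agent: writer] Task complete: write_article" in lines
    assert lines[-1] == "[crew] Crew execution completed"
```

Asserting on `await_count` as well as the log lines guards against a runner that logs completion without actually awaiting the task.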

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro Plus

Run ID: d1de7314-a69b-4b36-83a5-88b7c53dc53e

📥 Commits

Reviewing files that changed from the base of the PR and between 75bb882 and bb935d3.

📒 Files selected for processing (2)
  • lib/crewai/src/crewai/crew.py
  • lib/crewai/tests/test_crew.py
