Add trajectory debug export #30

Open
ceej640 wants to merge 3 commits into gently-project:development from ceej640:ceej/issue-10-debug-export
ceej640 commented May 31, 2026

Collaborator

Addresses #10.

Summary:

  • Add python -m gently.debug to create trajectory-debugging bundles for a session id, prefix, or session directory.
  • Summarize session artifacts and write transcript excerpts without copying large image/volume payloads.
  • Infer relevant source files from tool calls in decision/interaction logs.
  • Add a coding-agent debugging prompt template, docs, and unit tests.

Verification:

  • .\.venv\Scripts\python.exe -m pytest tests/test_debug_export.py -q -p no:cacheprovider
  • .\.venv\Scripts\python.exe -m gently.debug --help
  • git diff --check

pskeshu commented Jun 1, 2026 via email

Collaborator

ceej640 commented Jun 1, 2026

Collaborator Author

Yes. A debug bundle is much more useful if it includes profiling/timeline information, not just the exported artifacts after the fact.

The current PR helps package session evidence for debugging, but it does not yet instrument runtime spans. A follow-up should add a low-overhead profiler/trace log with correlation IDs across:

  • LLM calls, including latency and token/model metadata where available
  • tool calls and tool results
  • hardware queue/device waits
  • perception/image-processing steps
  • file I/O and datastore writes
  • UI/WebSocket events
  • errors, retries, and cancellation paths

Then python -m gently.debug can include a timeline/table summary showing which subsystem did what and when, plus the raw span log for deeper analysis. That would answer the profiler question cleanly without overloading this export-only PR.
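As a rough illustration of the span log described above, here is a minimal sketch of a context manager that writes one JSON line per span and shares a correlation ID across nested spans. Everything in it is an assumption for illustration: the `span` helper, the `correlation_id` context variable, and the record fields (`component`, `name`, `start`, `duration_s`, `status`, `error`) are hypothetical names, not part of this PR or of gently.

```python
import contextvars
import json
import time
import uuid
from contextlib import contextmanager

# Hypothetical: one correlation ID shared by all spans nested under the same
# outermost span (for example, one agent turn).
_correlation_id = contextvars.ContextVar("correlation_id", default=None)


@contextmanager
def span(component, name, sink):
    """Time a block and append one JSON line describing it to `sink`.

    `sink` is any text file-like object with a write() method.
    """
    cid = _correlation_id.get()
    token = None
    if cid is None:
        # Outermost span: mint a new correlation ID for everything nested inside.
        cid = uuid.uuid4().hex
        token = _correlation_id.set(cid)
    record = {
        "correlation_id": cid,
        "component": component,  # e.g. "llm", "tool", "hardware"
        "name": name,
        "start": time.time(),
        "status": "ok",
    }
    t0 = time.perf_counter()
    try:
        yield record
    except BaseException as exc:
        # Also covers cancellation-style exceptions; the exception is re-raised.
        record["status"] = "error"
        record["error"] = repr(exc)
        raise
    finally:
        record["duration_s"] = time.perf_counter() - t0
        sink.write(json.dumps(record) + "\n")
        if token is not None:
            _correlation_id.reset(token)
```

Inner spans finish first, so they appear before their parent in the log; the shared `correlation_id` is what lets a later timeline view group them back together.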

ceej640 commented Jun 1, 2026

Collaborator Author

Follow-up implemented from this thread in commit 2d66534.

I added profiler/timeline support to debug exports at the bundle layer:

  • The exporter now discovers profile.jsonl and profile_spans.jsonl in a session.
  • It writes profile_summary.json with span count, duration by component, and slowest spans.
  • debug_context.md includes a Profile Summary section so a coding agent can see which subsystem did what and when.
  • The span parser is permissive enough for LLM calls, tool calls, hardware waits, perception, file I/O, and UI/WebSocket events.

This does not instrument every runtime subsystem yet; it makes the debug bundle ready to consume those profiler spans.
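For readers who want a feel for the summary step, the following is a minimal sketch of how span lines could be reduced to a span count, duration by component, and slowest spans. It is not the exporter's actual code: the function name `summarize_spans`, the input field names (`component`, `name`, `duration_s`), and the output keys are assumptions chosen for illustration.

```python
import json
from collections import defaultdict


def summarize_spans(lines, top_n=5):
    """Summarize JSONL span lines, skipping anything that does not parse.

    Field names are hypothetical; the real span schema may differ.
    """
    spans = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            row = json.loads(line)
        except json.JSONDecodeError:
            continue  # permissive: ignore malformed lines
        if not isinstance(row, dict):
            continue
        duration = row.get("duration_s")
        # Require a real number; bool is excluded because it subclasses int.
        if isinstance(duration, bool) or not isinstance(duration, (int, float)):
            continue
        spans.append(
            {
                "component": str(row.get("component", "unknown")),
                "name": str(row.get("name", "")),
                "duration_s": float(duration),
            }
        )

    by_component = defaultdict(float)
    for s in spans:
        by_component[s["component"]] += s["duration_s"]

    slowest = sorted(spans, key=lambda s: s["duration_s"], reverse=True)[:top_n]
    return {
        "span_count": len(spans),
        "duration_by_component_s": dict(by_component),
        "slowest_spans": slowest,
    }
```

Skipping unparseable or incomplete lines, rather than failing, keeps the summary usable when a session was interrupted mid-write.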

Verification:

  • pytest tests/test_debug_export.py -q -p no:cacheprovider
  • python -m gently.debug --help
  • git diff --check

ceej640 commented Jun 1, 2026

Collaborator Author

Follow-up implemented from the profiler thread in commit 993cb28.

What changed:

  • Added a small profiler span writer in gently/debug/profiler.py.
  • Instrumented ToolRegistry.execute() to record tool-call spans with component/name/start/end/duration/status/error metadata.
  • Span output goes to the active FileStore session as profile_spans.jsonl when available, or to GENTLY_PROFILE_PATH when that environment variable is set.
  • Updated trajectory-debugging docs and tests so debug exports can consume the runtime spans added here.

This is still not full-system profiling across LLM calls, hardware waits, perception, file I/O, and UI/WebSocket events, but it starts runtime instrumentation at the common tool-execution layer.
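The tool-call instrumentation described above can be sketched roughly as follows. This is not the code in gently/debug/profiler.py or ToolRegistry.execute(): the helpers `resolve_span_path` and `execute_with_span` are hypothetical, the record layout is a guess based on the fields listed above, and the precedence between the session file and GENTLY_PROFILE_PATH (session first here) is an assumption.

```python
import json
import os
import time
from pathlib import Path


def resolve_span_path(session_dir=None):
    """Pick where spans go. Assumed precedence: session dir, then env var."""
    if session_dir is not None:
        return Path(session_dir) / "profile_spans.jsonl"
    env_path = os.environ.get("GENTLY_PROFILE_PATH")
    return Path(env_path) if env_path else None


def execute_with_span(name, func, /, *args, session_dir=None, **kwargs):
    """Call func(*args, **kwargs) and append one tool-call span as a JSON line."""
    start = time.time()
    t0 = time.perf_counter()
    status, error = "ok", None
    try:
        return func(*args, **kwargs)
    except Exception as exc:
        status, error = "error", repr(exc)
        raise  # profiling must not swallow the tool's failure
    finally:
        path = resolve_span_path(session_dir)
        if path is not None:  # no destination configured: record nothing
            duration = time.perf_counter() - t0
            record = {
                "component": "tool",
                "name": name,
                "start": start,
                "end": start + duration,
                "duration_s": duration,
                "status": status,
                "error": error,
            }
            path.parent.mkdir(parents=True, exist_ok=True)
            with path.open("a", encoding="utf-8") as fh:
                fh.write(json.dumps(record) + "\n")
```

Writing the span in a `finally` block means failed tool calls are recorded too, with `status` set to `"error"`, while the original exception still propagates to the caller.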

Verification:

  • non-writing compile check for profiler, registry, and debug analyzer modules
  • pytest tests/test_tool_registry.py tests/test_debug_export.py -q -p no:cacheprovider
  • git diff --check

2 participants