Add an end-to-end agent diagnostics command

## Summary

OpenAgents currently exposes several useful health and status signals, but there does not appear to be a single diagnostic command or API that confirms an agent is truly usable end to end.

A dedicated diagnostics flow would help users distinguish between:

- the daemon process is running
- an agent adapter is marked as `running`
- the workspace sees the agent as `online` via fresh heartbeats
- the configured LLM/provider is reachable
- the specific agent instance can receive a workspace task and successfully send a response

## Current behavior

From reading the codebase, the existing checks are spread across multiple layers:

- `agn status` reports daemon/adapter process state from `daemon.status.json`. This is useful liveness information, but it does not prove the agent can process a task.
- `/v1/health` confirms the workspace backend is reachable.
- `/v1/join`, `/v1/heartbeat`, and `/v1/discover` confirm workspace presence and can mark stale agents offline after heartbeat timeout.
- session rotation / `session_revoked` helps detect duplicate or ghost adapters.
- `/status` can make an agent post uptime/version/network information back into a channel, which is closer to a runtime response check but is not exposed as a unified diagnostic flow.
- `agn test-llm <type>` validates a generic LLM provider configuration, but it is not tied to a specific agent instance or workspace round trip.

## Expected behavior

It would be helpful to have a command and/or API such as:

```bash
agn diagnose <agent-name>
```

or a workspace-level diagnostic endpoint/control action that reports a structured verdict, for example:

```json
{
  "daemon": "ok",
  "adapter_state": "running",
  "workspace_health": "ok",
  "workspace_presence": "online",
  "heartbeat_age_seconds": 12,
  "session": "valid",
  "llm": "ok",
  "workspace_round_trip": "ok",
  "agent_response": "ok"
}
```

## Suggested checks

A first version could combine existing building blocks:

1. Check daemon liveness and local adapter state.
2. Check runtime readiness from the registry `check_ready` / health check metadata.
3. Check workspace backend `/v1/health`.
4. Check `/v1/discover` for the target agent and verify fresh heartbeat.
5. Check session validity / detect `session_revoked` where possible.
6. Optionally run the existing LLM smoke test for that agent type.
7. Send a lightweight diagnostic control action to the target agent and verify it posts a response within a timeout.

## Benefits

- Makes "agent is running but not responding" problems easier to debug.
- Gives users a single command to confirm whether an agent is actually usable.
- Helps separate local daemon issues from workspace connectivity, session conflicts, credential problems, and runtime/tool failures.
- Provides a better support artifact for bug reports than screenshots of multiple status surfaces.

## Notes

This is a feature request rather than a bug report. The existing status surfaces are useful, but they currently need to be manually combined to answer the question: "Can this agent actually receive and complete work right now?"

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add an end-to-end agent diagnostics command #485

Summary

Current behavior

Expected behavior

Suggested checks

Benefits

Notes

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Uh oh!

Add an end-to-end agent diagnostics command #485

Description

Summary

Current behavior

Expected behavior

Suggested checks

Benefits

Notes

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions