Write structured crash reason file on fatal service exit #1154

@geoffjay

Summary

Each service should persist a structured crash reason to a well-known file on fatal exit, enabling diagnostic tools to quickly surface why a service is down without parsing raw logs.

Problem

When a service crashes (e.g., due to a migration mismatch), the only record is in the stderr log. Diagnostic tools and other services have no structured way to determine the cause of a crash. This is especially problematic when:

  • Log files are rotated or truncated
  • The crash happens during startup before logging is fully initialized
  • Multiple restart attempts fill the log with repeated errors

Proposed Solution

  • On fatal exit, each service writes a JSON file to a standard location, e.g.:
    ~/Library/Application Support/agentd-<service>/last-error.json

  • Format:

    {
      "timestamp": "2026-04-17T19:15:49Z",
      "error": "Migration file of version 'm20260329_000012_create_projects_table' is missing",
      "category": "migration",
      "exit_code": 1
    }

  • The file is overwritten on each crash, so it always reflects the most recent failure.

  • On successful startup, the file is deleted (or a last-healthy.json is written alongside it).
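
A minimal sketch of the write and clear steps, for illustration only. The issue does not say what language the services are written in, so Python is used here as a neutral choice. The function names `crash_file`, `write_crash_reason`, and `clear_crash_reason` are hypothetical, as is the `base_dir` parameter (on macOS it would presumably be `~/Library/Application Support`). The temp-file-plus-rename step is an assumption beyond what the issue asks for; it is there so a reader never sees a half-written file.

```python
import json
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def crash_file(service: str, base_dir: Path) -> Path:
    # Hypothetical helper mirroring the agentd-<service>/last-error.json layout.
    return base_dir / f"agentd-{service}" / "last-error.json"


def write_crash_reason(path: Path, error: str, category: str, exit_code: int) -> None:
    """Persist the most recent fatal error, replacing any earlier record."""
    record = {
        "timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "error": error,
        "category": category,
        "exit_code": exit_code,
    }
    path.parent.mkdir(parents=True, exist_ok=True)
    # Assumption: write to a temp file in the same directory, then rename,
    # so readers see either the old record or the new one, never a partial file.
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=".last-error-", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(record, f, indent=2)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, path)
    except BaseException:
        # Do not leave the temp file behind if anything went wrong.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise


def clear_crash_reason(path: Path) -> None:
    """Call after a successful startup; a missing file is not an error."""
    path.unlink(missing_ok=True)
```

A service would call `write_crash_reason(...)` from its top-level fatal-error handler just before exiting, and `clear_crash_reason(...)` once startup has completed. Writing `last-healthy.json` instead of deleting is the alternative the proposal mentions and is not shown here.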

Acceptance Criteria

  • All services write last-error.json on fatal exit
  • File is written to a standard, discoverable location
  • File is cleared or replaced on successful startup
  • MCP diagnostic tools read this file when checking service health
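
The last criterion could be met by a reader along these lines. This is a sketch under assumptions, not the MCP tools' actual interface: `read_crash_reason` and `describe_health` are hypothetical names, and treating a malformed or incomplete file as "no record" is one possible policy, chosen here so that a diagnostic tool never fails on a bad file.

```python
import json
from pathlib import Path
from typing import Optional

# Field names taken from the JSON format in the proposal.
REQUIRED_FIELDS = ("timestamp", "error", "category", "exit_code")


def read_crash_reason(path: Path) -> Optional[dict]:
    """Return the crash record, or None if no usable file exists."""
    try:
        record = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Missing or unreadable file, or invalid JSON/encoding.
        return None
    if not isinstance(record, dict):
        return None
    if any(field not in record for field in REQUIRED_FIELDS):
        return None
    return record


def describe_health(service: str, path: Path) -> str:
    """Hypothetical one-line summary for a diagnostic tool."""
    record = read_crash_reason(path)
    if record is None:
        return f"{service}: no crash record found"
    return (
        f"{service}: last crashed at {record['timestamp']} "
        f"[{record['category']}, exit {record['exit_code']}]: {record['error']}"
    )
```

Note that "no crash record found" only means the file is absent or unusable; a real health check would still need to confirm the service is actually running.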

Metadata

Assignees

No one assigned

Labels

  • complexity:medium (Medium scope: <200 lines, 1-2 files)
  • enhancement (New feature or request)
  • triaged (Issue has been triaged, ready for planning or implementation)
