MCP run_tests tool — on-demand test execution for agents #19

@hellno

Summary

Add a run_tests MCP tool that lets AI agents trigger a project's test suite on demand and receive structured pass/fail results. Today agents can deploy and check logs, but they cannot run tests and iterate on the results, which is the tightest feedback loop for producing correct code.

Motivation

Agents produce significantly better work when given immediate, deterministic feedback. Currently the only "backpressure" Jack provides is deploy success/failure and log output. A structured test runner would let agents:

  • Write code → run tests → fix failures → repeat, without human intervention
  • Validate changes before deploying (shift-left)
  • Get machine-readable results (not just log text) to reason about failures precisely

Proposed behavior

MCP tool: run_tests

Input:
  - project_id (optional, defaults to current project)
  - test_command (optional, auto-detected from package.json scripts)
  - filter (optional, run specific test files/patterns)

Output:
  - success: boolean
  - summary: { total, passed, failed, skipped }
  - failures: [{ test_name, file, error_message, diff? }]
  - duration_ms: number
  - raw_output: string (truncated)
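
As a rough sketch, the input and output above could be typed as follows, together with a mapper from the report that `jest --json` prints. The interface and function names (`RunTestsInput`, `RunTestsResult`, `fromJestReport`, `truncate`) and the 4000-character cap are assumptions for illustration only; the result field names are the ones proposed in this issue, and the report fields are a subset of Jest's JSON output.

```typescript
// Hypothetical type names; only the field names come from the proposal.
interface RunTestsInput {
  project_id?: string;   // defaults to the current project
  test_command?: string; // auto-detected when omitted
  filter?: string;       // test file or pattern
}

interface TestFailure {
  test_name: string;
  file: string;
  error_message: string;
  diff?: string;
}

interface RunTestsResult {
  success: boolean;
  summary: { total: number; passed: number; failed: number; skipped: number };
  failures: TestFailure[];
  duration_ms: number;
  raw_output: string; // truncated
}

// Subset of the report printed by `jest --json`.
interface JestJsonReport {
  success: boolean;
  numTotalTests: number;
  numPassedTests: number;
  numFailedTests: number;
  numPendingTests: number; // Jest reports skipped tests as "pending"
  testResults: Array<{
    name: string; // path of the test file
    assertionResults: Array<{
      fullName: string;
      status: string;
      failureMessages?: string[] | null;
    }>;
  }>;
}

const MAX_RAW_OUTPUT = 4000; // assumed cap, not specified in the proposal

function truncate(text: string, max: number = MAX_RAW_OUTPUT): string {
  return text.length <= max ? text : text.slice(0, max) + "\n[truncated]";
}

function fromJestReport(
  report: JestJsonReport,
  rawOutput: string,
  durationMs: number,
): RunTestsResult {
  const failures: TestFailure[] = [];
  for (const fileResult of report.testResults) {
    for (const assertion of fileResult.assertionResults) {
      if (assertion.status !== "failed") continue;
      failures.push({
        test_name: assertion.fullName,
        file: fileResult.name,
        error_message: (assertion.failureMessages ?? []).join("\n"),
      });
    }
  }
  return {
    success: report.success,
    summary: {
      total: report.numTotalTests,
      passed: report.numPassedTests,
      failed: report.numFailedTests,
      skipped: report.numPendingTests,
    },
    failures,
    duration_ms: durationMs,
    raw_output: truncate(rawOutput),
  };
}
```

Other runners would need their own mappers into the same `RunTestsResult` shape, so the agent always sees one format regardless of the runner.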

Detection logic

  • Check package.json for test, test:unit, test:integration scripts
  • Support common runners: vitest, jest, bun test, playwright
  • Parse structured output (JSON reporters) when available, fall back to stdout parsing
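
One way the script lookup and runner detection could work is sketched below. The helper names (`detectTestCommand`, `detectRunner`), the priority order of the script names, and the choice of `npm run` as the launcher are all assumptions; a real implementation would probably respect the project's package manager.

```typescript
type Runner = "vitest" | "jest" | "bun" | "playwright" | "unknown";

interface PackageJson {
  scripts?: Record<string, string>;
}

interface DetectedCommand {
  script: string;  // name of the package.json script that matched
  command: string; // shell command that would be executed
  runner: Runner;
}

// Assumed priority: the plain "test" script wins over the more specific ones.
const SCRIPT_PRIORITY = ["test", "test:unit", "test:integration"];

// Guess the runner from the script body, e.g. "vitest run" -> "vitest".
function detectRunner(scriptBody: string): Runner {
  if (/\bvitest\b/.test(scriptBody)) return "vitest";
  if (/\bjest\b/.test(scriptBody)) return "jest";
  if (/\bbun\s+test\b/.test(scriptBody)) return "bun";
  if (/\bplaywright\b/.test(scriptBody)) return "playwright";
  return "unknown";
}

function detectTestCommand(pkg: PackageJson): DetectedCommand | null {
  const scripts = pkg.scripts ?? {};
  for (const name of SCRIPT_PRIORITY) {
    const body = scripts[name];
    if (typeof body !== "string" || body.trim() === "") continue;
    // Skip the placeholder script that `npm init` generates.
    if (body.includes("no test specified")) continue;
    return { script: name, command: `npm run ${name}`, runner: detectRunner(body) };
  }
  return null; // nothing detected; the caller must pass test_command explicitly
}
```

Knowing the runner lets the tool decide whether a JSON reporter is available or whether it has to fall back to parsing stdout.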

Execution

  • BYO mode: Run locally via shell
  • Managed mode: Run in Jack Cloud sandbox (requires compute allocation — could be a follow-up)
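
For BYO mode, a minimal sketch of local execution with a timeout might look like this, using Node's `child_process.spawnSync`. The function name `runLocally`, the `LocalRunOutcome` shape, and the 10 MB output buffer are assumptions; a real MCP server would more likely use the asynchronous `spawn` so the process is not blocked while tests run.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical intermediate shape, before the output is parsed into structured results.
interface LocalRunOutcome {
  exit_code: number | null; // null when the process was killed
  timed_out: boolean;
  duration_ms: number;
  raw_output: string;
}

function runLocally(command: string, cwd: string, timeoutMs: number = 60_000): LocalRunOutcome {
  const started = Date.now();
  const result = spawnSync(command, {
    cwd,
    shell: true,                 // test commands are shell strings such as "npm run test"
    encoding: "utf8",
    timeout: timeoutMs,          // Node kills the child once this elapses
    maxBuffer: 10 * 1024 * 1024, // assumed cap on captured output
  });
  // On timeout, spawnSync reports an error whose code is "ETIMEDOUT".
  const error = result.error as (Error & { code?: string }) | undefined;
  return {
    exit_code: result.status,
    timed_out: error?.code === "ETIMEDOUT",
    duration_ms: Date.now() - started,
    raw_output: `${result.stdout ?? ""}${result.stderr ?? ""}`,
  };
}
```

A call such as `runLocally("npm run test", projectDir)` would then feed `raw_output` into the reporter parsing step, and a `timed_out` outcome could be reported as a failed run with a clear message rather than as a tool error.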

Acceptance criteria

  • run_tests MCP tool callable from Claude Code / Claude Desktop
  • Auto-detects test command from project config
  • Returns structured results (not just raw text)
  • Handles timeouts gracefully (default 60s, configurable)
  • Works in BYO mode; managed mode can return "not yet supported" initially

    Labels

    enhancement (New feature or request)