13 changes: 13 additions & 0 deletions docs/showcases/README.md
@@ -82,6 +82,19 @@ commit-backed self-iteration case, and one contributor-approved interactive
workflow case that shows how LoopX coordinates generated scripts and worker
agents under a shared control plane.

## Additional Public Evidence Cases

| Case | Pattern | Status | Public Surface |
| --- | --- | --- | --- |
| [0623 agent-to-agent PR comment and fix loop](cases/0623-agent-to-agent-pr-comments.md) | Agent handoff, PR comment loop, review packet | Public-safe pattern case | Redacted lifecycle narrative |
| [0623 overnight project refactor](cases/0623-overnight-project-refactor.md) | PR-sized slices, todo follow-up, supersede | Public-safe pattern case | Redacted lifecycle narrative |
| [0624 PR issue automatic fix loop](cases/0624-pr-issue-auto-fix.md) | Issue-fix workflow, repro smoke, reviewer handoff | Public-safe pattern case | Redacted workflow narrative |
| [0627 overnight PR batch with reviewable control](cases/0627-overnight-pr-batch.md) | PR-sized slices, validation writeback, public-boundary discipline | Public Git evidence case | 22 merged commits over a 10-hour public Git window |

Additional evidence cases stay in the catalog as appendix surfaces. They do not
join the first three canonical PoC cards until they gain a reproducible demo or
a deeper public evidence packet.

## Appendix Cases

| Case | Pattern | Status | Public Surface |
58 changes: 58 additions & 0 deletions docs/showcases/cases/0623-agent-to-agent-pr-comments.md
@@ -0,0 +1,58 @@
# 0623: Agent-To-Agent PR Comment And Fix Loop

## Summary

This case captures a public-safe version of a multi-agent review loop: one
agent lane can notice or respond to PR review feedback, while another lane
keeps the implementation and fix evidence reviewable. The important behavior is
not the chat transcript. It is the control-plane loop around a PR: comment,
handoff, fix, validation, and review packet.

The original evidence included operator-side screenshots and review context, so
this repository keeps only the reusable pattern. Public PR surfaces can be used
as evidence, but raw screenshots and private coordination details stay out of
the repo.

## Pattern

A review comment is a good boundary object for long-running agents:

- it is concrete enough to turn into a todo;
- it belongs to a public or reviewable PR surface;
- it can be routed to the agent that owns the implementation lane;
- it can be closed only after a fix and validation are visible.

LoopX keeps that flow explicit instead of relying on a human to remember which
agent saw the comment.

## LoopX Behavior

LoopX contributes the following control-plane pieces:

- a claimed todo names the PR feedback or comment thread;
- a handoff gate keeps the blocked agent from guessing outside its lane;
- the implementation agent records the fix and validation evidence;
- the review packet points the reviewer back to the public PR surface;
- follow-up work becomes a successor todo rather than a loose chat note.
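
The control-plane pieces above can be sketched as a small state model. This is
a minimal illustration, not LoopX's actual API: `ReviewTodo`, its fields, and
its method names are hypothetical and exist only to show the claim, handoff
gate, evidence, and successor rules in one place.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReviewTodo:
    """A PR comment turned into a bounded work item (hypothetical model)."""
    comment_ref: str                     # pointer to the public PR comment thread
    owner: Optional[str] = None          # agent lane that claimed the todo
    fix_ref: Optional[str] = None        # e.g. a commit or diff reference
    validation: Optional[str] = None     # e.g. the smoke command that passed
    closed: bool = False
    successors: List["ReviewTodo"] = field(default_factory=list)

    def claim(self, agent: str) -> None:
        if self.owner is not None and self.owner != agent:
            raise PermissionError(f"todo already claimed by {self.owner}")
        self.owner = agent

    def record_fix(self, agent: str, fix_ref: str, validation: str) -> None:
        # Handoff gate: only the owning lane may record fix evidence.
        if agent != self.owner:
            raise PermissionError(f"{agent} does not own this todo")
        self.fix_ref = fix_ref
        self.validation = validation

    def close(self) -> None:
        # A todo closes only once both fix and validation are visible.
        if not (self.fix_ref and self.validation):
            raise ValueError("cannot close without fix and validation evidence")
        self.closed = True

    def follow_up(self, comment_ref: str) -> "ReviewTodo":
        # Follow-up work becomes a successor todo, not a chat note.
        successor = ReviewTodo(comment_ref=comment_ref)
        self.successors.append(successor)
        return successor
```

In this sketch an agent outside the owning lane gets an error rather than a
chance to guess, and `close()` refuses to run until the evidence fields are
filled in.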

## User-Facing Value

The operator does not need to manually shepherd every PR comment across agent
threads. LoopX turns review feedback into a bounded work item with owner,
evidence, and handoff state. That makes agent-to-agent collaboration useful
without hiding the final review responsibility.

## Evidence Boundary

This case excludes private screenshots, raw chats, internal review notes, local
state, credentials, and unpublished artifacts. The public-safe evidence shape
is the PR comment/fix lifecycle itself: a visible PR surface, a claimed todo,
the fix diff, validation output, and the resulting review packet.

## Website Story Beats

1. A PR receives feedback that should become executable work.
2. LoopX turns the feedback into an owned todo instead of a chat reminder.
3. Another agent lane implements or verifies the fix.
4. The review packet links the comment, fix, and validation evidence.
5. Follow-up work remains explicit as successor todos.
58 changes: 58 additions & 0 deletions docs/showcases/cases/0623-overnight-project-refactor.md
@@ -0,0 +1,58 @@
# 0623: Overnight Project Refactor As PR-Sized Slices

## Summary

This case captures a long unattended refactor that stayed reviewable because
LoopX kept splitting the work into bounded PR-sized slices. The reusable lesson
is that autonomous refactoring should not land as one huge diff. It should keep
todo follow-up, supersede decisions, validation, and review boundaries visible.

The source note described an overnight refactor wave. This repository records
the public-safe control-plane pattern rather than private screenshots or local
project state.

## Before

Large refactors are a bad fit for naive autonomous loops. Without a control
plane, an agent can keep editing after the original plan is stale, mix cleanup
with behavior changes, or produce a broad diff that is hard to review.

The desired behavior is:

1. keep the goal and current slice explicit;
2. finish one reviewable unit at a time;
3. create follow-up todos for remaining work;
4. supersede stale todos when the refactor discovers a better route;
5. validate each slice before merge or handoff.

## LoopX Behavior

LoopX makes that refactor loop durable:

- `todo follow-up` turns discoveries into the next concrete slice;
- `supersede` prevents stale tasks from staying runnable;
- quota and status keep the current slice separate from adjacent cleanup;
- review packets and focused smokes keep each PR independently checkable;
- public/private boundary scans prevent local planning material from leaking
into public docs.
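
As a rough sketch of how follow-up and supersede keep the runnable set small,
consider the model below. The `Slice` and `RefactorPlan` names, the status
strings, and the method signatures are assumptions made for illustration; the
source does not describe the real data model.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Slice:
    """One PR-sized unit of refactor work (hypothetical model)."""
    slice_id: str
    goal: str
    status: str = "open"                  # "open", "done", or "superseded"
    superseded_by: Optional[str] = None


class RefactorPlan:
    def __init__(self) -> None:
        self.slices: Dict[str, Slice] = {}

    def add(self, slice_id: str, goal: str) -> Slice:
        self.slices[slice_id] = Slice(slice_id, goal)
        return self.slices[slice_id]

    def follow_up(self, done_id: str, new_id: str, goal: str) -> Slice:
        # Finish the current slice and turn the discovery into the next one.
        self.slices[done_id].status = "done"
        return self.add(new_id, goal)

    def supersede(self, stale_id: str, new_id: str, goal: str) -> Slice:
        # A stale slice is kept for the record but can no longer be run.
        stale = self.slices[stale_id]
        stale.status = "superseded"
        stale.superseded_by = new_id
        return self.add(new_id, goal)

    def runnable(self) -> List[str]:
        return [s.slice_id for s in self.slices.values() if s.status == "open"]
```

The design point is that a superseded slice is never deleted: it stays visible
with a pointer to its replacement, so a reviewer can see why the route changed.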

## User-Facing Value

The operator can let a refactor continue overnight while still waking up to
reviewable units. The project moves faster, but the review surface remains
human-sized.

## Evidence Boundary

This case excludes private screenshots, raw chats, internal planning notes,
local paths, credentials, raw logs, and unpublished project artifacts. Public
evidence should come from the resulting PR-sized diffs, validation commands,
and follow-up/supersede state, not from raw agent traces.
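
A boundary check of this kind could look roughly like the sketch below. The
pattern set and the `scan` helper are hypothetical and deliberately small; a
real scan would need patterns tuned to the repository and would still not
replace human review.

```python
import re
from typing import List, Tuple

# Hypothetical patterns for material that should stay out of public docs.
BOUNDARY_PATTERNS = {
    "local path": re.compile(r"/Users/|/home/|[A-Za-z]:\\"),
    "credential": re.compile(
        r"\b(api[_-]?key|secret|token|password)\b\s*[:=]", re.IGNORECASE
    ),
    "private key": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, label) for every line that matches a pattern."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for label, pattern in BOUNDARY_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, label))
    return findings


# Made-up draft text, used only to exercise the scan.
draft = (
    "Validated with the focused smoke.\n"
    "See /home/alice/notes/plan.md\n"
    "API_KEY=abc123\n"
)
findings = scan(draft)
```

Here `findings` flags the second and third lines of the draft and leaves the
first one alone.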

## Website Story Beats

1. A broad refactor starts as a long-running goal.
2. LoopX keeps the current slice explicit.
3. Follow-up and supersede convert discoveries into reviewable next steps.
4. Each slice gets validation and a review packet.
5. The operator reviews bounded PRs instead of a giant autonomous diff.
59 changes: 59 additions & 0 deletions docs/showcases/cases/0624-pr-issue-auto-fix.md
@@ -0,0 +1,59 @@
# 0624: PR Issue Automatic Fix Loop

## Summary

This case captures the issue-to-fix loop: review feedback, issue text, or a PR
comment should become an executable repair plan with a repro or focused smoke,
not an informal note in chat. LoopX turns that signal into a bounded workflow
that can classify the problem, prepare a branch, implement a fix, validate it,
and report the result back to the review surface.

The original showcase included private visual evidence. This public case keeps
only the reusable product pattern and the repository surfaces that support it.

## Pattern

Automatic issue fixing needs more than "read the issue and edit files." A safe
workflow needs to:

- classify whether the issue body or review comment is enough to act on;
- create or identify a focused reproduction path;
- keep private or gated issue bodies out of public fixtures;
- make the implementation branch explicit;
- run a small validation command before reporting success;
- record any unresolved reviewer decision as a concrete todo.

## LoopX Behavior

LoopX supports the loop with issue-fix planning and command-pack-style
contracts:

- the initial signal becomes ordered todos rather than prose;
- gated reads remain explicit when a body or comment is not safe to consume;
- implementation and validation steps stay separate;
- review feedback can create a successor todo instead of being lost after a PR
comment;
- the final packet records what was fixed, what was validated, and what still
needs a reviewer.
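
One way to picture "ordered todos rather than prose" is a planner that maps a
signal to a fixed sequence of steps. The sketch below is illustrative only:
`Signal`, `plan_repair`, and the step labels are hypothetical and are not taken
from LoopX's issue-fix planner.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Signal:
    """An issue, review comment, or PR comment (hypothetical model)."""
    ref: str
    body_is_public: bool   # False when the body comes from a gated source
    has_repro: bool


def plan_repair(signal: Signal) -> List[str]:
    """Turn a signal into ordered todos; implementation and validation stay separate."""
    steps = []
    if not signal.body_is_public:
        # Gated read: make approval an explicit first todo instead of
        # consuming the body silently.
        steps.append(f"gate: request approval to read {signal.ref}")
    steps.append(f"classify: decide whether {signal.ref} is actionable")
    if signal.has_repro:
        steps.append("repro: reuse the existing focused smoke")
    else:
        steps.append("repro: write a focused smoke that fails before the fix")
    steps.append("branch: create an explicit implementation branch")
    steps.append("implement: apply the fix")
    steps.append("validate: run the focused smoke")
    steps.append("report: record fixed, validated, and open reviewer decisions")
    return steps
```

Keeping `implement` and `validate` as separate entries means a failed
validation leaves a visible open todo instead of a silent partial fix.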

## User-Facing Value

The operator can point LoopX at a review issue and expect a controlled repair
loop: understand the request, create a repro, implement the fix, validate it,
and surface remaining review decisions. The user does not have to translate
every PR comment into a manual agent prompt.

## Evidence Boundary

This case excludes private screenshots, raw issue bodies from gated sources,
internal review notes, local paths, raw logs, credentials, and unpublished
repository artifacts. Public evidence should be the sanitized workflow plan,
focused smoke, branch diff, and public PR review outcome.

## Website Story Beats

1. A PR issue or review comment appears.
2. LoopX classifies the issue and creates ordered repair todos.
3. The agent builds or finds a focused repro.
4. The fix lands as a reviewable branch diff with validation.
5. Remaining reviewer decisions are written back as concrete todos.
102 changes: 102 additions & 0 deletions docs/showcases/cases/0627-overnight-pr-batch.md
@@ -0,0 +1,102 @@
# 0627: Overnight PR Batch With Reviewable Control

## Summary

LoopX produced an overnight burst of public repository progress without turning
the project into an unreadable pile of agent output. In the ten-hour public Git
window from `2026-06-27 01:29 +08:00` to `2026-06-27 11:29 +08:00`, the public
repository advanced by 22 merged commits touching 60 files, with 6695 insertions
and 223 deletions.

This case is useful because the signal is PR-shaped and reviewable. The work
landed as small slices across docs, state projection, issue-fix workflow,
event-sourced state, benchmark launch contracts, status/quota smokes, and
release/runtime guardrails. LoopX did not make a single giant change that a
maintainer had to trust blindly.

The public case deliberately uses merged Git history as the evidence floor. The
operator-side note also tracked a larger contemporaneous PR queue, but this
page only claims what the public repository can support.

## Public Repository Signal

The evidence window is anchored to public Git history and can be reproduced
locally:

```bash
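# Count commits in the window (one line per commit).
# Note: --since/--until filter on committer date.
git log --since="2026-06-27T01:29:00+08:00" \
  --until="2026-06-27T11:29:00+08:00" --oneline

# Per-file insertions and deletions for the same window.
git log --since="2026-06-27T01:29:00+08:00" \
  --until="2026-06-27T11:29:00+08:00" --numstat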
```

| Signal | Value |
| --- | --- |
| Public evidence window | 2026-06-27 01:29 +08:00 to 2026-06-27 11:29 +08:00 |
| Merged commits in window | 22 |
| Unique files touched | 60 |
| Public insertions / deletions | 6695 / 223 |
| Commit messages with explicit PR numbers | 10 |
| Evidence floor | Public Git history only |
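
The figures in the table can be recomputed from `git log --numstat` output. The
sketch below assumes the log was produced with
`--numstat --pretty=format:'COMMIT %s'` and that PR numbers appear in commit
subjects as `#123`; the `COMMIT ` marker and the `summarize` helper are choices
made for this example, and the sample text is made up rather than taken from
the repository.

```python
import re
from typing import Dict

PR_NUMBER = re.compile(r"#\d+")


def summarize(log_text: str) -> Dict[str, int]:
    """Summarize `git log --numstat --pretty=format:'COMMIT %s'` output."""
    commits = with_pr = insertions = deletions = 0
    files = set()
    for line in log_text.splitlines():
        if line.startswith("COMMIT "):
            commits += 1
            if PR_NUMBER.search(line):
                with_pr += 1
            continue
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # blank separator lines between commits
        added, deleted, path = parts
        # Simplification: renamed paths are counted under their rename notation.
        files.add(path)
        # Binary files report "-" for both counts; treat them as zero lines.
        insertions += int(added) if added.isdigit() else 0
        deletions += int(deleted) if deleted.isdigit() else 0
    return {
        "commits": commits,
        "commits_with_pr_number": with_pr,
        "unique_files": len(files),
        "insertions": insertions,
        "deletions": deletions,
    }


# Made-up sample, not real repository output.
sample = (
    "COMMIT Add issue-fix planning guide (#101)\n"
    "40\t2\tdocs/guide.md\n"
    "5\t0\tREADME.md\n"
    "\n"
    "COMMIT Tighten status smoke\n"
    "12\t3\ttests/smoke.py\n"
    "-\t-\tdocs/diagram.png\n"
    "1\t1\tREADME.md\n"
)
summary = summarize(sample)
```

Against a real checkout, the same function could be fed the output of the
`git log` command for the evidence window, with the `--pretty` format added.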

Representative merged slices in the window included:

- issue-fix workflow planning and command-pack guidance;
- event-sourced LoopX state contracts, API, compaction, and downstream read
path checks;
- Terminal-Bench and SkillsBench launch or prerequisite contracts;
- status/quota performance budget and projection smokes;
- rollout-state documentation and README workflow refinement;
- agent-scope wait scheduler progression.

## LoopX Behavior

The product behavior was not "make more commits." The useful behavior was that
high-throughput work stayed bounded and reviewable:

- each slice remained small enough to review as a PR or PR-sized commit;
- public docs, examples, and runtime code moved together when the contract
changed;
- focused smokes validated reusable control-plane behavior instead of
preserving raw run traces;
- self-merge stayed limited to narrow validated changes;
- broader review gates and handoffs remained visible instead of being hidden
behind the throughput number;
- public/private boundary checks kept internal screenshots, local state, raw
logs, and private planning out of the repository.
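
The self-merge rule can be read as a routing decision per finished slice. The
sketch below is an assumption-heavy illustration: the source does not say what
counts as "narrow", so `MAX_FILES`, `MAX_LINES`, the `SliceReport` fields, and
the route names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SliceReport:
    """Facts about a finished slice (hypothetical model)."""
    files_changed: int
    lines_changed: int
    validations_passed: bool
    touches_public_contract: bool


# Hypothetical thresholds; the source does not define "narrow".
MAX_FILES = 3
MAX_LINES = 80


def merge_route(report: SliceReport) -> str:
    """Return "self-merge", "review", or "blocked" for a finished slice."""
    if not report.validations_passed:
        return "blocked"
    narrow = (
        report.files_changed <= MAX_FILES
        and report.lines_changed <= MAX_LINES
    )
    if narrow and not report.touches_public_contract:
        return "self-merge"
    # Anything broader stays behind a visible review gate.
    return "review"
```

The point of the sketch is that throughput never overrides the gate: a slice
that fails validation is blocked regardless of size, and a validated but broad
slice still goes to review.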

## User-Facing Value

For an operator, this case shows a different shape of agent productivity:
overnight progress can be high-throughput without becoming high-risk. The user
can wake up to a batch of merged, reviewable public slices, while the control
plane still records what changed, which validations ran, which gates remained,
and which evidence is safe to publish.

For an agent-platform developer, the reusable pattern is a PR-scale work loop:
LoopX keeps each lane tied to todo ownership, validation, review policy, and
public evidence, so a long-running agent team can move quickly without relying
on chat memory or private screenshots.

## Evidence Boundary

This case intentionally excludes private workspace state, internal documents,
screenshots, raw chats, local paths, raw benchmark logs, credentials, and any
unpublished operator notes. The public evidence floor is Git history and the
public repository surfaces it changed.

The 22-commit window is not a universal productivity benchmark. It is a
showcase of reviewable control-plane throughput in one public repository at one
point in time. Future versions can strengthen the case by linking each public
PR number to its validation evidence and review outcome.

## Website Story Beats

1. A long-running LoopX project enters an overnight autonomous work window.
2. Many small slices land across runtime, docs, benchmark contracts, smokes, and
state projection.
3. LoopX keeps each slice tied to todo ownership, validation, and review policy.
4. The operator sees public Git evidence instead of raw agent logs.
5. The evidence boundary keeps private screenshots and internal planning out of
the showcase.