CopilotCLIAdapter.execute(): per-token message_delta events joined with '\n' → response text comes out one token per line #6

Description

@ScottRBK

Summary

CopilotCLIAdapter.execute() reconstructs the final response text by joining every
type="text" StreamEvent with "\n". For Copilot those events are per-token deltas
(the deltaContent field of assistant.message_delta events), so the joined result is
the assistant's message exploded to one token/sub-word per line instead of the
original prose.

Affected version: agent-shell-py==0.1.15.

Symptom

AgentResponse.response for a Copilot run comes back as hundreds of near-empty
lines. Reconstructing "Running a quick connectivity check to Forgetful" produces:

Running
 a
 quick
 connectivity
 check
 to
 Forget
ful

(Note Forget + ful arriving as two separate deltas — confirming these are
token-level, not message-level, chunks.)

Root cause

Two lines, both in agent_shell/adapters/copilot_cli_adapter.py:

  1. _parse_event() maps each per-token delta to a type="text" event (~L231-234):

    elif t == "assistant.message_delta":
        delta_content = event.get("data", {}).get("deltaContent", "")
        if delta_content:
            events.append(StreamEvent(type="text", content=delta_content))

  2. execute() aggregates those events with a newline separator (~L62):

    text = "\n".join(e.content for e in chunks if e.type == "text")

deltaContent is defined by the Copilot SDK as an incremental chunk to append
(the canonical consumer is stdout.write(delta) — direct concatenation, no
separator). Joining token deltas with "\n" is therefore wrong for this adapter.
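
The difference between the two joins can be reproduced in isolation. This is a minimal sketch using the delta sequence from the Symptom section as plain strings; it does not touch the adapter or the Copilot SDK.

```python
# Deltas as they arrive for one assistant message (copied from the symptom above).
deltas = ["Running", " a", " quick", " connectivity", " check", " to", " Forget", "ful"]

# What execute() does today: a newline between every delta.
newline_joined = "\n".join(deltas)

# What an append-style consumer does: direct concatenation.
concatenated = "".join(deltas)

print(repr(newline_joined))
print(concatenated)
```

Only the concatenated form recovers the original sentence; the newline-joined form carries one separator per delta boundary.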

Why this is Copilot-specific

The "\n".join(...) line is shared boilerplate across all adapters
(claude_code_adapter.py, opencode_adapter.py, codex_adapter.py), but it is only
harmful for Copilot:

  • Claude Code / OpenCode / Codex emit type="text" events built from whole
    message/content blocks, so "\n".join just separates distinct blocks — correct.
  • Copilot is the only adapter emitting token-granular deltas as type="text".

pi_adapter.py already documents this exact trap and deliberately avoids it — it
surfaces text on the block-level text_end event, with the comment:

"Text and thinking are surfaced on their _end event (full block). Streaming the
per-token deltas instead would corrupt execute()'s newline-join of text."

Copilot's adapter didn't follow that convention.

Impact

  • AgentResponse.response is unusable as prose for Copilot runs — any consumer that
    prints or parses it gets a token-per-line wall of text.
  • Downstream in eval-harness this feeds a print(response.response), filling
    per-agent logs with token-per-line output (worst on Copilot).

Suggested fix

⚠️ Do not change the shared "\n".join to "".join globally — that would
regress the block-level adapters (distinct text blocks would run together with no
separation).
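
To illustrate the regression a global change would cause, here is a small sketch with two made-up block-level text events of the kind the other adapters emit. The block contents are hypothetical.

```python
# Two whole text blocks, as a block-level adapter might emit them (hypothetical content).
blocks = ["I checked the config.", "Everything looks fine."]

with_newline = "\n".join(blocks)   # current shared behaviour: blocks stay separated
without_sep = "".join(blocks)      # proposed global change: blocks run together

print(repr(with_newline))
print(repr(without_sep))
```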

Two adapter-local options, in order of preference:

  1. Coalesce deltas into whole-message text events (mirrors pi_adapter): keep
    the token deltas for the live stream() path if desired, but emit a single
    consolidated type="text" event per assistant message so execute()'s
    "\n".join stays correct for every adapter. If the Copilot assistant.message
    event carries the full text, read it there instead of accumulating deltas.

  2. Minimal, Copilot-only: since each adapter has its own execute(), change
    only Copilot's aggregation to concatenate directly:

    text = "".join(e.content for e in chunks if e.type == "text")

    Simpler, but note it concatenates distinct assistant messages within a turn with
    no separator (acceptable given deltas have no inherent line breaks, but option 1
    is cleaner and keeps all adapters consistent).
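
Option 1 could look roughly like the sketch below. It is not the adapter's code: StreamEvent here is a two-field stand-in, coalesce_text_events is a hypothetical helper, and treating assistant.message as the end-of-message marker is an assumption that would need checking against the real event stream.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class StreamEvent:
    """Stand-in for agent_shell's StreamEvent; only the two fields used here."""
    type: str
    content: str


def coalesce_text_events(raw_events: Iterable[dict]) -> Iterator[StreamEvent]:
    """Yield one type="text" event per assistant message instead of one per delta."""
    buffer: list[str] = []
    for event in raw_events:
        kind = event.get("type")
        if kind == "assistant.message_delta":
            delta = event.get("data", {}).get("deltaContent", "")
            if delta:
                buffer.append(delta)
        elif kind == "assistant.message":
            # Assumption: this event closes the current message, so flush the buffer.
            if buffer:
                yield StreamEvent(type="text", content="".join(buffer))
                buffer = []
    # Flush trailing deltas if the stream ended without a closing event.
    if buffer:
        yield StreamEvent(type="text", content="".join(buffer))


raw = [
    {"type": "assistant.message_delta", "data": {"deltaContent": "Hello"}},
    {"type": "assistant.message_delta", "data": {"deltaContent": " world"}},
    {"type": "assistant.message", "data": {}},
    {"type": "assistant.message_delta", "data": {"deltaContent": "Done."}},
    {"type": "assistant.message", "data": {}},
]
chunks = list(coalesce_text_events(raw))

# The shared aggregation line from execute() now separates whole messages only.
text = "\n".join(e.content for e in chunks if e.type == "text")
print(repr(text))
```

With deltas coalesced this way, the shared newline join needs no change and distinct messages stay on separate lines.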

A regression test should assert that a Copilot delta stream like
["Hello", " ", "world"] reconstructs to "Hello world", not "Hello\n \nworld".
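
A sketch of such a test follows. join_copilot_text is a hypothetical stand-in for the aggregation step; a real test would drive CopilotCLIAdapter with a canned delta stream instead.

```python
def join_copilot_text(deltas):
    """Hypothetical stand-in for the aggregation step in Copilot's execute()."""
    return "".join(deltas)


def test_copilot_deltas_reconstruct_without_newlines():
    text = join_copilot_text(["Hello", " ", "world"])
    assert text == "Hello world"
    assert "\n" not in text


# Run directly here; under pytest the function would be collected automatically.
test_copilot_deltas_reconstruct_without_newlines()
```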
