Summary
CopilotCLIAdapter.execute() reconstructs the final response text by joining every
type="text" StreamEvent with "\n". For Copilot those events are per-token
deltas (assistant.message_delta → deltaContent), so the joined result is the
assistant's message exploded to one token/sub-word per line instead of the
original prose.
Affected version: agent-shell-py==0.1.15.
Symptom
AgentResponse.response for a Copilot run comes back as hundreds of near-empty
lines. Reconstructing "Running a quick connectivity check to Forgetful" produces:
Running
a
quick
connectivity
check
to
Forget
ful
(Note Forget + ful arriving as two separate deltas — confirming these are
token-level, not message-level, chunks.)
Root cause
Two lines, both in agent_shell/adapters/copilot_cli_adapter.py:
- _parse_event() maps each per-token delta to a type="text" event (~L231-234):

      elif t == "assistant.message_delta":
          delta_content = event.get("data", {}).get("deltaContent", "")
          if delta_content:
              events.append(StreamEvent(type="text", content=delta_content))

- execute() aggregates those events with a newline separator (~L62):

      text = "\n".join(e.content for e in chunks if e.type == "text")

deltaContent is defined by the Copilot SDK as an incremental chunk to append
(the canonical consumer is stdout.write(delta) — direct concatenation, no
separator). Joining token deltas with "\n" is therefore wrong for this adapter.
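
The effect of the separator can be reproduced in isolation. The snippet below is a minimal sketch, not the adapter's code: StreamEvent here is a stand-in dataclass (the real class in agent_shell may have more fields), and the exact token boundaries, including the leading spaces, are assumed for illustration.

```python
from dataclasses import dataclass


@dataclass
class StreamEvent:
    """Stand-in for agent_shell's StreamEvent; the real class may differ."""
    type: str
    content: str


# Assumed token boundaries for the sentence from the Symptom section.
deltas = ["Running", " a", " quick", " connectivity", " check", " to", " Forget", "ful"]
chunks = [StreamEvent(type="text", content=d) for d in deltas]

# What execute() does today: one delta per line.
newline_joined = "\n".join(e.content for e in chunks if e.type == "text")

# What an append-style consumer of deltaContent would produce.
concatenated = "".join(e.content for e in chunks if e.type == "text")

print(repr(newline_joined))
print(repr(concatenated))
```

Only the concatenated form gives back the original sentence; the newline-joined form has one line break per delta boundary.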
Why this is Copilot-specific
The "\n".join(...) line is shared boilerplate across all adapters
(claude_code_adapter.py, opencode_adapter.py, codex_adapter.py), but it is only
harmful for Copilot:
- Claude Code / OpenCode / Codex emit type="text" events built from whole
  message/content blocks, so "\n".join just separates distinct blocks, which
  is correct.
- Copilot is the only adapter emitting token-granular deltas as type="text".
pi_adapter.py already documents this exact trap and deliberately avoids it — it
surfaces text on the block-level text_end event, with the comment:
"Text and thinking are surfaced on their _end event (full block). Streaming the
per-token deltas instead would corrupt execute()'s newline-join of text."
Copilot's adapter didn't follow that convention.
Impact
AgentResponse.response is unusable as prose for Copilot runs — any consumer that
prints or parses it gets a token-per-line wall of text.
- Downstream in eval-harness this feeds a print(response.response), filling
  per-agent logs with token-per-line output (worst on Copilot).
Suggested fix
⚠️ Do not change the shared "\n".join to "".join globally — that would
regress the block-level adapters (distinct text blocks would run together with no
separation).
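
To illustrate why the global change would regress the other adapters, here is a small sketch with two made-up block-level text contents (the strings are invented for the example, not taken from a real run):

```python
# Two whole text blocks, as a block-level adapter would emit them.
# The contents are invented for illustration.
blocks = ["Checking connectivity.", "Connection OK."]

# Current shared behaviour: blocks stay visually separate.
block_joined = "\n".join(blocks)

# After a global switch to "".join: blocks run together.
run_together = "".join(blocks)

print(repr(block_joined))
print(repr(run_together))
```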
Two adapter-local options, in order of preference:
1. Coalesce deltas into whole-message text events (mirrors pi_adapter): keep
   the token deltas for the live stream() path if desired, but emit a single
   consolidated type="text" event per assistant message so execute()'s
   "\n".join stays correct for every adapter. If the Copilot assistant.message
   event carries the full text, read it there instead of accumulating deltas.
2. Minimal, Copilot-only: since each adapter has its own execute(), change
   only Copilot's aggregation to concatenate directly:

       text = "".join(e.content for e in chunks if e.type == "text")

   Simpler, but note it concatenates distinct assistant messages within a turn
   with no separator (acceptable given deltas have no inherent line breaks, but
   option 1 is cleaner and keeps all adapters consistent).
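
A rough sketch of how option 1 might look, under several assumptions that should be checked against the real adapter and the Copilot event schema: raw events are taken to be dicts with a "type" key, assistant.message is taken to mark the end of a message, and the "content" field read from it is hypothetical. coalesce_text_events and the StreamEvent stand-in are illustrative names, not existing agent_shell APIs.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class StreamEvent:
    """Stand-in for agent_shell's StreamEvent; the real class may differ."""
    type: str
    content: str


def coalesce_text_events(raw_events: Iterable[dict]) -> Iterator[StreamEvent]:
    """Yield one type="text" event per assistant message (hypothetical helper)."""
    buffer: List[str] = []
    for event in raw_events:
        t = event.get("type")  # ASSUMPTION: the event type lives under "type"
        data = event.get("data", {})
        if t == "assistant.message_delta":
            delta = data.get("deltaContent", "")
            if delta:
                buffer.append(delta)  # append with no separator
        elif t == "assistant.message":
            # ASSUMPTION: "content" is a hypothetical field name for the full
            # text; fall back to the accumulated deltas when it is absent.
            full_text = data.get("content") or "".join(buffer)
            buffer.clear()
            if full_text:
                yield StreamEvent(type="text", content=full_text)
    if buffer:
        # Flush deltas that were never closed by an assistant.message event.
        yield StreamEvent(type="text", content="".join(buffer))


raw = [
    {"type": "assistant.message_delta", "data": {"deltaContent": "Hello"}},
    {"type": "assistant.message_delta", "data": {"deltaContent": " "}},
    {"type": "assistant.message_delta", "data": {"deltaContent": "world"}},
    {"type": "assistant.message", "data": {}},
    {"type": "assistant.message_delta", "data": {"deltaContent": "Done."}},
]
events = list(coalesce_text_events(raw))

# The shared aggregation line now separates whole messages, not tokens.
text = "\n".join(e.content for e in events if e.type == "text")
print(repr(text))
```

With events shaped like this, the shared "\n".join in execute() would not need to change, and distinct assistant messages would still be separated by a newline.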
A regression test should assert that a Copilot delta stream like
["Hello", " ", "world"] reconstructs to "Hello world", not "Hello\n \nworld".
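
A possible shape for that test, written pytest-style. reconstruct_copilot_text is a hypothetical stand-in that mirrors the fixed aggregation line; a real test would more likely drive CopilotCLIAdapter.execute() against a stubbed CLI, whose setup is not shown in this issue.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StreamEvent:
    """Stand-in for agent_shell's StreamEvent; the real class may differ."""
    type: str
    content: str


def reconstruct_copilot_text(chunks: List[StreamEvent]) -> str:
    """Hypothetical helper mirroring the fixed Copilot aggregation."""
    return "".join(e.content for e in chunks if e.type == "text")


def test_copilot_deltas_reconstruct_without_newlines():
    chunks = [StreamEvent(type="text", content=d) for d in ["Hello", " ", "world"]]
    result = reconstruct_copilot_text(chunks)
    assert result == "Hello world"
    assert "\n" not in result


test_copilot_deltas_reconstruct_without_newlines()
```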