Symptom
Today's ~/.entrabot/data/interactions/2026-06-09.jsonl contains four entries that are clearly test-fixture artifacts, not real interactions:
{"recipient": "c1", "summary": "hello", "content_ref": "msg-1", "ts": "2026-06-09T21:52:38.387939+00:00", "channel": "teams_unknown", "action": "send_teams_message"}
{"recipient": "c1", "summary": "hello team", "content_ref": "msg-pfx", "ts": "2026-06-09T21:52:38.461622+00:00", "channel": "teams_unknown", "action": "send_teams_message"}
{"recipient": "c1", "summary": "hello team", "content_ref": "msg-no-pfx", "ts": "2026-06-09T21:52:38.539766+00:00", "channel": "teams_unknown", "action": "send_teams_message"}
(Plus at least one more variant, visible in bootstrap_body_state()'s top_chats_today output: chat_id="c1" ranks #2 with 24 interactions for today, ahead of real chats. Operational analytics stay polluted until the day rolls over.)
These were written when the test suite ran post-merge for PR #21 (today around 21:52 UTC, well after the day's real activity started).
Root cause
tests/tools/test_watch.py has at least two tests in the send_teams_message suite (around lines 320-422, e.g. test_send_prefixes_in_delegated_mode, test_send_no_prefix_in_agent_user_mode) that:
- Mock the Graph API via respx.post(f"{GRAPH_BASE}/chats/c1/messages").mock(...) ✓
- Mock token acquisition via patch("entrabot.mcp_server.acquire_agent_user_token", ...) ✓
- Save and restore mcp_server._state ✓
- Do NOT isolate log_interaction() writes. mcp_server._state["config"] is set to a bare MagicMock(), with no monkeypatching of config.data_dir.
The production send_teams_message path calls _log_interaction_safe(...) as a side-effect. Inside interaction_log.py, the write target is computed from config.data_dir / "interactions" / "<day>.jsonl". When config is a MagicMock, config.data_dir is a MagicMock attribute too — which suggests writes should be no-op. The fact that they hit the real disk means either:
- _log_interaction_safe has a fallback to Path.home() / ".entrabot" / "data" / "interactions" when config is missing/wrong-shape, OR
- The MagicMock resolves to a path that defaults to the user's data dir via some env var fallback, OR
- One of these tests sets a real config without monkeypatching
The exact mechanism needs a 5-minute walk through interaction_log.py's path resolution to nail down — but the fix shape is clear either way.
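One quick check that may narrow the hypotheses above: on Python 3.8+, MagicMock auto-configures both the "/" operator and __fspath__, so the path expression does not raise against a bare mock. The sketch below uses only the standard library. probe_log_target is a hypothetical helper, not something in entrabot, and it shows what the expression evaluates to, not which branch interaction_log.py actually takes.

```python
import os
from pathlib import Path
from types import SimpleNamespace
from unittest.mock import MagicMock


def probe_log_target(config, day="2026-06-09"):
    """Evaluate the issue's path expression against `config` without touching disk.

    Returns (is_real_path, rendered), where rendered is os.fspath() of the result.
    """
    target = config.data_dir / "interactions" / f"{day}.jsonl"
    return isinstance(target, Path), os.fspath(target)


if __name__ == "__main__":
    # A bare mock: "/" is auto-configured, so the result is another mock, not a Path.
    print(probe_log_target(MagicMock()))
    # A config-shaped object with a real data_dir, for comparison.
    print(probe_log_target(SimpleNamespace(data_dir=Path("/tmp/entrabot-probe"))))
```

If the string rendered for the mock is not under ~/.entrabot, the leak more likely comes from a fallback or a real config than from the mock itself. That is an inference to test during the walk-through, not a confirmed diagnosis.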
Proposed fix
For the test side (the cheaper change):
import pytest
from entrabot.config import Config


@pytest.fixture
def isolated_data_dir(monkeypatch, tmp_path):
    """Redirect entrabot's data_dir to a tmp path so writes don't pollute ~/.entrabot."""
    monkeypatch.setattr("entrabot.tools.interaction_log.DATA_DIR_OVERRIDE", tmp_path, raising=False)
    # Or, if config is the path: build a real Config pointed at tmp_path and use it
    return tmp_path
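For the "config is the path" variant mentioned in the comment above, a minimal sketch could look like the following. It assumes the logger derives its target from config.data_dir, which is still unconfirmed. make_isolated_config is a hypothetical helper, and log_interaction here is a stand-in that only mirrors the path shape described in this issue, not entrabot's real function.

```python
import json
import tempfile
from pathlib import Path
from unittest.mock import MagicMock


def make_isolated_config(data_dir):
    """Return a mock config whose data_dir is a real Path (hypothetical helper)."""
    config = MagicMock()
    config.data_dir = Path(data_dir)  # a real Path, so "/" yields real paths
    return config


def log_interaction(config, entry, day):
    """Stand-in for entrabot's logger: data_dir / "interactions" / "<day>.jsonl"."""
    target = Path(config.data_dir) / "interactions" / f"{day}.jsonl"
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config = make_isolated_config(tmp)
        written = log_interaction(config, {"recipient": "c1", "summary": "hello"}, "2026-06-09")
        # The write stays inside the throwaway directory.
        print(Path(tmp) in written.parents)
```

In test_watch.py the equivalent would be assigning make_isolated_config(tmp_path) to mcp_server._state["config"] in place of the bare MagicMock(), keeping the existing save/restore of _state.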
Apply this fixture (or equivalent monkeypatching of mcp_server._state["config"].data_dir) to every test that invokes a send_*, read_*, or audit_log MCP tool. Audit: grep all tests that exercise tools listed in _log_interaction_safe's call sites in mcp_server.py and ensure each test isolates data_dir.
For the production side (defense in depth, optional):
_log_interaction_safe could refuse to write when it detects a MagicMock-ish config shape, OR — better — when os.environ.get("PYTEST_CURRENT_TEST") is set, default the log target to tempfile.mkdtemp() unless an explicit override is in place. That way, even a sloppy future test can't leak by accident.
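A sketch of the PYTEST_CURRENT_TEST variant, under the assumption that path resolution can be funneled through one function. resolve_log_root and its parameters are hypothetical names; only the environment variable (set by pytest while a test runs) and tempfile.mkdtemp() come from the proposal above.

```python
import os
import tempfile
from pathlib import Path

_PYTEST_TMP = None  # throwaway directory, created lazily once per process


def resolve_log_root(configured_dir, env=None, override=None):
    """Pick the directory interaction logs should be written under.

    Precedence: explicit override, then a temp dir when pytest is running,
    then the configured directory.
    """
    global _PYTEST_TMP
    env = os.environ if env is None else env
    if override is not None:
        return Path(override)
    if env.get("PYTEST_CURRENT_TEST"):
        if _PYTEST_TMP is None:
            _PYTEST_TMP = Path(tempfile.mkdtemp(prefix="entrabot-test-"))
        return _PYTEST_TMP
    return Path(configured_dir)
```

Reusing one temp directory per process keeps entries from a single test run together, which helps when a test wants to read back what it logged. The trade-off is that production code gains a test-only branch, which is why this stays optional.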
Cleanup of today's polluted log
The four polluted entries are visually identifiable (recipient "c1", content_refs msg-1 / msg-pfx / msg-no-pfx, all clustered at 21:52:38). A small jq script can strip them:
jq -c 'select(.recipient != "c1")' ~/.entrabot/data/interactions/2026-06-09.jsonl > /tmp/cleaned.jsonl
mv /tmp/cleaned.jsonl ~/.entrabot/data/interactions/2026-06-09.jsonl
Or leave them — the day rolls at UTC midnight, and the next bootstrap_body_state will show a clean state tomorrow. The pollution is one-day, not durable.
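If the entries are stripped rather than left, a narrower filter may be safer than dropping everything with recipient "c1". The sketch below matches on recipient and the content_refs listed in this issue, then swaps the file in one step. FIXTURE_REFS covers only the three refs shown above, so any further variants would need to be added by hand.

```python
import json
import os
import tempfile
from pathlib import Path

FIXTURE_REFS = {"msg-1", "msg-pfx", "msg-no-pfx"}  # content_refs listed in this issue


def is_fixture_entry(entry):
    """True for entries that match the leaked test-fixture shape."""
    return entry.get("recipient") == "c1" and entry.get("content_ref") in FIXTURE_REFS


def strip_fixture_entries(log_path):
    """Rewrite log_path without fixture entries; return how many were dropped."""
    log_path = Path(log_path)
    kept, dropped = [], 0
    for line in log_path.read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        if is_fixture_entry(json.loads(line)):
            dropped += 1
        else:
            kept.append(line)
    # Write beside the original, then replace it in a single rename.
    fd, tmp_name = tempfile.mkstemp(dir=log_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        fh.writelines(line + "\n" for line in kept)
    os.replace(tmp_name, log_path)
    return dropped
```

A malformed line raises before anything is written, so the original file is left untouched in that case.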
Repro
# Inspect current state
jq -c 'select(.recipient == "c1")' ~/.entrabot/data/interactions/2026-06-09.jsonl
# Or, re-run the suite to add MORE leaked entries
.venv/bin/pytest tests/tools/test_watch.py -v -k "prefix or no_prefix"
Each test run appends three more c1/hello/hello team entries to the live interactions log.
Related
- tests/tools/test_watch.py:320-422 (the specific leaking tests)
- src/entrabot/tools/interaction_log.py (log_interaction() and its path resolution)
- src/entrabot/mcp_server.py (_log_interaction_safe() wrapper and its ~10 call sites)
- bootstrap_body_state and read_interactions
Out of scope
- Bigger test-isolation overhaul (some tests already use tmp_path correctly; this issue is specifically the send_teams_message family in test_watch.py).
- Adding test-collection hooks that auto-isolate every interaction write. Worth considering as a follow-up if pollution keeps recurring after this fix, but not the right v1 shape.