Test fixture pollution: send_teams_message tests in test_watch.py write to live interaction log #22

@brandwe

Symptom

Today's ~/.entrabot/data/interactions/2026-06-09.jsonl contains at least four entries that are clearly test-fixture artifacts, not real interactions. Three of them:

{"recipient": "c1", "summary": "hello",      "content_ref": "msg-1",      "ts": "2026-06-09T21:52:38.387939+00:00", "channel": "teams_unknown", "action": "send_teams_message"}
{"recipient": "c1", "summary": "hello team", "content_ref": "msg-pfx",    "ts": "2026-06-09T21:52:38.461622+00:00", "channel": "teams_unknown", "action": "send_teams_message"}
{"recipient": "c1", "summary": "hello team", "content_ref": "msg-no-pfx", "ts": "2026-06-09T21:52:38.539766+00:00", "channel": "teams_unknown", "action": "send_teams_message"}

(Plus at least one more variant. The effect also shows up in bootstrap_body_state()'s top_chats_today output: chat_id="c1" ranks #2 with 24 interactions for today, ahead of real chats. Operational analytics stay polluted until the day rolls over.)

These were written when the test suite ran post-merge for PR #21 (today around 21:52 UTC, well after the day's real activity started).

Root cause

tests/tools/test_watch.py has at least two tests in the send_teams_message suite (around lines 320-422, e.g. test_send_prefixes_in_delegated_mode, test_send_no_prefix_in_agent_user_mode) that:

  1. Mock the Graph API via respx.post(f"{GRAPH_BASE}/chats/c1/messages").mock(...)
  2. Mock token acquisition via patch("entrabot.mcp_server.acquire_agent_user_token", ...)
  3. Save and restore mcp_server._state
  4. Do NOT isolate log_interaction() writes. mcp_server._state["config"] is set to a bare MagicMock(), with no monkeypatching of config.data_dir.

The production send_teams_message path calls _log_interaction_safe(...) as a side effect. Inside interaction_log.py, the write target is computed from config.data_dir / "interactions" / "<day>.jsonl". When config is a MagicMock, config.data_dir is a MagicMock attribute too, which suggests the writes should be a no-op. The fact that they hit the real disk means one of the following:

  • _log_interaction_safe has a fallback to Path.home() / ".entrabot" / "data" / "interactions" when config is missing/wrong-shape, OR
  • The MagicMock resolves to a path that defaults to the user's data dir via some env var fallback, OR
  • One of these tests sets a real config without monkeypatching

The exact mechanism needs a 5-minute walk through interaction_log.py's path resolution to nail down — but the fix shape is clear either way.
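
One detail worth keeping in mind during that walk-through: a bare MagicMock supports the / operator, so the path expression itself does not fail, it just produces another mock. A minimal sketch follows; the variable names and the day string are illustrative, and it does not show what interaction_log.py actually does with the result:

```python
from pathlib import Path
from unittest.mock import MagicMock

# Stand-in for mcp_server._state["config"] as the tests set it up.
config = MagicMock()

# Same shape as the production path expression; the day string is illustrative.
target = config.data_dir / "interactions" / "2026-06-09.jsonl"

# MagicMock implements __truediv__, so no exception is raised above and the
# result is another mock rather than a filesystem path.
print(isinstance(target, MagicMock))  # True
print(isinstance(target, Path))       # False
```

So whatever reaches the real disk has to come from how the logging code converts or replaces that mock value, which is exactly what the walk-through needs to establish.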

Proposed fix

For the test side (the cheaper change):

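import pytest

@pytest.fixture
def isolated_data_dir(monkeypatch, tmp_path):
    """Redirect entrabot's data_dir to a tmp path so writes don't pollute ~/.entrabot."""
    # Option A: patch a module-level override. DATA_DIR_OVERRIDE is a placeholder name;
    # raising=False avoids an AttributeError if interaction_log does not define it, but
    # the patch only has an effect if the logging code actually reads that attribute.
    monkeypatch.setattr("entrabot.tools.interaction_log.DATA_DIR_OVERRIDE", tmp_path, raising=False)
    # Option B, if config is the path: build a real Config (from entrabot.config)
    # pointed at tmp_path and use it as mcp_server._state["config"].
    return tmp_path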

Apply this fixture (or equivalent monkeypatching of mcp_server._state["config"].data_dir) to every test that invokes a send_*, read_*, or audit_log MCP tool. Audit: grep all tests that exercise tools listed in _log_interaction_safe's call sites in mcp_server.py and ensure each test isolates data_dir.
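
As a rough illustration of the "real path instead of a bare MagicMock" variant, the sketch below uses a hypothetical write_interaction() as a stand-in for the real logging code and types.SimpleNamespace as a stand-in for Config. Neither helper name comes from entrabot. It only shows that once data_dir is a real temporary directory, the write lands there:

```python
import json
import tempfile
from pathlib import Path
from types import SimpleNamespace


def write_interaction(config, day: str, entry: dict) -> Path:
    """Hypothetical stand-in for the real logger: append one JSON line."""
    target = Path(config.data_dir) / "interactions" / f"{day}.jsonl"
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")
    return target


def make_isolated_config(tmp_path: Path) -> SimpleNamespace:
    """Config stand-in whose data_dir is a real, throwaway directory."""
    return SimpleNamespace(data_dir=Path(tmp_path))


# In a pytest test, tmp_path would come from the built-in fixture; a
# TemporaryDirectory plays that role here so the sketch runs on its own.
with tempfile.TemporaryDirectory() as tmp:
    config = make_isolated_config(Path(tmp))
    written = write_interaction(
        config, "2026-06-09", {"recipient": "c1", "summary": "hello"}
    )
    inside_tmp = Path(tmp) in written.parents
    line_count = len(written.read_text(encoding="utf-8").splitlines())
```

In the actual tests, the equivalent would be assigning such an object (or a real Config) to mcp_server._state["config"] before invoking the tool, then asserting on the file under tmp_path if desired.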

For the production side (defense in depth, optional):

_log_interaction_safe could refuse to write when it detects a MagicMock-ish config shape, OR — better — when os.environ.get("PYTEST_CURRENT_TEST") is set, default the log target to tempfile.mkdtemp() unless an explicit override is in place. That way, even a sloppy future test can't leak by accident.
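
A minimal sketch of that guard, assuming the path resolution can be pulled into one helper. The name resolve_interactions_dir and its parameters are hypothetical and not taken from interaction_log.py; PYTEST_CURRENT_TEST is the environment variable pytest sets while a test is running:

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def resolve_interactions_dir(
    data_dir: object,
    override: Optional[Path] = None,
    environ: Optional[Mapping[str, str]] = None,
) -> Path:
    """Pick the directory for interaction logs (hypothetical helper)."""
    env = os.environ if environ is None else environ

    # An explicit override always wins, including under pytest.
    if override is not None:
        return Path(override) / "interactions"

    under_pytest = "PYTEST_CURRENT_TEST" in env
    # A bare MagicMock is neither str nor Path, so this also catches
    # the "MagicMock-ish config" case.
    real_path = isinstance(data_dir, (str, Path))

    if under_pytest or not real_path:
        # Divert to a throwaway directory instead of the live data dir.
        return Path(tempfile.mkdtemp(prefix="entrabot-test-")) / "interactions"

    return Path(data_dir) / "interactions"
```

Passing environ explicitly keeps the helper itself testable under pytest; production callers would leave it unset and get os.environ.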

Cleanup of today's polluted log

The four polluted entries are visually identifiable (recipient "c1", content_refs msg-1 / msg-pfx / msg-no-pfx, all clustered at 21:52:38). A small jq script can strip them:

jq -c 'select(.recipient != "c1")' ~/.entrabot/data/interactions/2026-06-09.jsonl > /tmp/cleaned.jsonl
mv /tmp/cleaned.jsonl ~/.entrabot/data/interactions/2026-06-09.jsonl

Or leave them — the day rolls at UTC midnight, and the next bootstrap_body_state will show a clean state tomorrow. The pollution is one-day, not durable.
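
If the cleanup is done, it may be worth reviewing what would be removed before overwriting the file. A small sketch that applies the same predicate as the jq filter but returns both halves; split_fixture_entries is a hypothetical helper, and the "real-chat" line in the sample is invented for illustration:

```python
import json
from typing import Iterable, List, Tuple


def split_fixture_entries(
    lines: Iterable[str], recipient: str = "c1"
) -> Tuple[List[str], List[str]]:
    """Return (kept, dropped) JSONL lines, dropping entries sent to `recipient`."""
    kept: List[str] = []
    dropped: List[str] = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # skip blank lines
        entry = json.loads(text)
        (dropped if entry.get("recipient") == recipient else kept).append(text)
    return kept, dropped


sample = [
    '{"recipient": "c1", "summary": "hello", "content_ref": "msg-1"}',
    # Invented example of a genuine entry, for illustration only.
    '{"recipient": "real-chat", "summary": "status update", "content_ref": "msg-42"}',
    '{"recipient": "c1", "summary": "hello team", "content_ref": "msg-pfx"}',
]
kept, dropped = split_fixture_entries(sample)
print(len(kept), len(dropped))  # 1 2
```

Inspecting dropped first guards against deleting a real interaction that happens to share the recipient value.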

Repro

# Inspect current state
jq -c 'select(.recipient == "c1")' ~/.entrabot/data/interactions/2026-06-09.jsonl

# Or, re-run the suite to add MORE leaked entries
.venv/bin/pytest tests/tools/test_watch.py -v -k "prefix or no_prefix"

Each test run appends three more c1/hello/hello team entries to the live interactions log.

Related

Out of scope

  • Bigger test-isolation overhaul (some tests already use tmp_path correctly; this issue is specifically the send_teams_message family in test_watch.py).
  • Adding test-collection hooks that auto-isolate every interaction write. Worth considering as a follow-up if pollution keeps recurring after this fix, but not the right v1 shape.
