feat: add read_interactions + bootstrap_body_state for body-side observation (#20) by brandwe · Pull Request #21 · microsoft/entrabot

brandwe · 2026-06-09T21:41:36Z

Closes #20.

Summary

Three additions giving the model a read path into entrabot's operational storage so it can observe its own history during a turn — the body-side analogue of persona-sati's observe discipline.

read_interactions() MCP tool — new module src/entrabot/tools/read_interactions.py. Chronological filter over interactions/<day>.jsonl via MemoryBackend (works on both LocalBackend and BlobBackend). Filters: chat_id, sender (case-insensitive), action, direction, since (default = now − 24 h), limit (default 10). Default window scans today + yesterday; since can reach back up to 7 days (hard cap, logged when hit). No embeddings, no scoring, no caching — JSONL re-reads are cheap (sub-10 ms in the common case).
bootstrap_body_state() MCP tool — new module src/entrabot/tools/body_bootstrap.py. Single packet of today's today_counts (total / inbound / outbound / by_action / by_channel), top_chats_today (up to 5, ties broken by recency), open_promises (ALL open, not top-N — commitments are durable), cursor_freshness (watched_chat_count / cursors_present / cursors_stale / oldest+newest cursor_ts), watched_chat_count, generated_at. Index only — no message summaries leak into the payload; full content stays in read_interactions.
Body-prompt rule in prompts/anatomy/identity-and-tools.md — pre-send observe scoped to outbound publishing (send_teams_message, send_email, send_card, share_file). Reads, lists, and audit entries do not need it. Same cheap-not-precious posture as persona-sati's observe.
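The filter semantics described above for read_interactions() can be sketched as a pure function over already-parsed log entries. This is a minimal illustration, not the code in src/entrabot/tools/read_interactions.py: the function names and the `ts` timestamp field are assumptions, and the real tool reads entries through MemoryBackend rather than taking a list.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def parse_ts(value: str) -> datetime:
    """Parse an ISO-8601 timestamp; naive values are treated as UTC."""
    dt = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if dt.tzinfo is None:
        dt = dt.replace(tzinfo=timezone.utc)
    return dt


def counterparty(entry: dict) -> Optional[str]:
    """Recipient for outbound entries, metadata.chat_id for inbound ones."""
    if entry.get("direction") == "outbound":
        return entry.get("recipient")
    return (entry.get("metadata") or {}).get("chat_id")


def filter_interactions(entries, *, chat_id=None, sender=None, action=None,
                        direction=None, since=None, limit=10, now=None):
    """Apply all filters together and return the newest `limit` matches."""
    now = now or datetime.now(timezone.utc)
    cutoff = since or (now - timedelta(hours=24))  # default window: last 24 h
    matched = []
    for entry in entries:
        ts = parse_ts(entry["ts"])  # "ts" is an assumed field name
        if ts < cutoff:
            continue
        if chat_id is not None and counterparty(entry) != chat_id:
            continue
        if sender is not None and (entry.get("sender") or "").lower() != sender.lower():
            continue
        if action is not None and entry.get("action") != action:
            continue
        if direction is not None and entry.get("direction") != direction:
            continue
        matched.append((ts, entry))
    # Most recent first; sort on the timestamp only so dicts are never compared.
    matched.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in matched[:limit]]
```

Passing `now` explicitly keeps the sketch deterministic in tests; a `since` value is expected to be timezone-aware.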

Test plan

.venv/bin/pytest -v --tb=short → 1339 passed (was 1281; +58 from new tests), 1 skipped (pre-existing), 8 warnings (all pre-existing _background_poll in chat_cursors tests, unrelated)
.venv/bin/ruff check . → All checks passed
.venv/bin/pytest tests/tools/test_daily_summary*.py -v → all passed (explicit regression check — read_day() write path untouched)

New tests:

tests/tools/test_read_interactions.py — 26 tests: filter logic (chat_id including outbound-recipient + inbound-metadata-chat_id paths, sender case-insensitivity, action, direction); since-cutoff including day-boundary crossover; 7-day cap; limit honored; sort order most-recent-first; missing day file handled; corrupt line skipped; all filters compose.
tests/tools/test_body_bootstrap.py — 23 tests: empty-state sensible zeros; today_counts correctness; top_chats sorting + tie-break by recency + 5-cap + excludes entries without chat_id; all-open-promises (not top-N); cursor freshness picks up stale vs fresh + oldest/newest timestamps; watched_chat_count from persisted file; INDEX-only invariant (full summaries never leak into bootstrap payload).
tests/test_mcp_server_body_tools.py — 7 tests: both tools registered with FastMCP, callable via mcp._tool_manager._tools, return JSON strings (matches read_email / list_promises convention), validation errors come back as {"error": "..."}.
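The top_chats behavior those tests cover (sort by count, tie-break by recency, cap at 5, skip entries without a chat_id) might look roughly like the sketch below. The function name, the flat `chat_id` and `ts` fields, and the returned dict shape are assumptions for illustration; the sketch also assumes every timestamp is a uniformly formatted UTC ISO-8601 string, so string comparison matches chronological order.

```python
def top_chats(entries, cap=5):
    """Rank chats by message count, breaking ties by most recent activity."""
    stats = {}
    for entry in entries:
        chat_id = entry.get("chat_id")
        if not chat_id:
            continue  # entries without a chat_id are excluded from the ranking
        chat = stats.setdefault(
            chat_id, {"chat_id": chat_id, "count": 0, "last_activity": ""}
        )
        chat["count"] += 1
        if entry["ts"] > chat["last_activity"]:
            chat["last_activity"] = entry["ts"]
    ranked = sorted(
        stats.values(),
        key=lambda chat: (chat["count"], chat["last_activity"]),
        reverse=True,
    )
    return ranked[:cap]
```

A test for the tie-break needs two chats with equal counts and different latest timestamps; asserting on the order of the result then exercises the recency branch directly.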

Design choices made within latitude

7-day cap behavior: a since reaching further back than 7 days is capped to a 7-day file scan without raising an error; a warning is logged at the call site. Going deeper requires a follow-up change to raise the cap intentionally: the cost of unbounded JSONL scans is real, the cost of a follow-up is small.
Missing day file: treated as zero entries for that day (matches read_day()'s existing raw is None → [] behavior). No error, no warning.
chat_id filter semantics: matches recipient for outbound and metadata.chat_id for inbound — mirrors daily_summary._counterparty(). Consistent with how the existing log writers populate the schema.
bootstrap_body_state() open_promises is ALL, not top-N: promises are durable commitments. Capping them would hide work the model owes humans.
Cursor freshness staleness threshold: uses the canonical chat_cursors.is_stale() (24 h cap). Defining a second threshold here would risk drift.
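One way to get the 7-day cap behavior described above is to clamp the cutoff before any files are opened and log when the clamp takes effect. This is a hedged sketch: `clamp_since`, `MAX_LOOKBACK`, and the log message are hypothetical names, and the PR may enforce the cap on the list of day files instead of on the cutoff itself.

```python
import logging
from datetime import datetime, timedelta, timezone

logger = logging.getLogger(__name__)

MAX_LOOKBACK = timedelta(days=7)  # hard cap on how far back a scan may reach


def clamp_since(since: datetime, now: datetime) -> datetime:
    """Return `since`, or the 7-day floor if `since` reaches further back."""
    floor = now - MAX_LOOKBACK
    if since < floor:
        # No error is raised; the caller still gets results, just a bounded scan.
        logger.warning(
            "since=%s exceeds the %d-day cap; scanning from %s instead",
            since.isoformat(), MAX_LOOKBACK.days, floor.isoformat(),
        )
        return floor
    return since
```

Both arguments are assumed to be timezone-aware so the comparison is well defined.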

Out of scope (per issue)

Read-tool logging (whether read_teams_messages etc. should be in the interaction log) — separate decision.
Semantic scoring / embeddings — chronological is enough for v1.
Cross-day backfill beyond 7 days — explicit follow-up if needed.
VERSION / CHANGELOG bumps — release-time scope.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

…rvation (#20)

Two new MCP tools and one body-prompt rule give the model a read path into its own operational storage:

- read_interactions(): chronological filter over interactions/<day>.jsonl via MemoryBackend. Filters: chat_id, sender, action, direction, since, limit. Default window today + yesterday; up to 7 days when since reaches back further.
- bootstrap_body_state(): single-packet index of today's counts, top chats, all open promises, and watched-chat cursor freshness. Mirrors persona-sati's bootstrap_session shape. Index only; full content stays in read_interactions.
- prompts/anatomy/identity-and-tools.md: pre-outbound-send observe rule scoped to send_teams_message / send_email / send_card / share_file.

Read path is purely additive; interaction_log.py write path and the on-disk JSONL schema are unchanged. daily_summary regression check green (38 tests).

Closes #20.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Adds body-side “observation” tooling so the model can read entrabot’s own operational storage (interaction log + a compact bootstrap index) during a turn, reducing unnecessary Graph calls and improving continuity.

Changes:

Introduces a read_interactions() implementation that filters interactions/<day>.jsonl chronologically with structured filters and a bounded multi-day scan window.
Introduces bootstrap_body_state() to return a single index packet (counts, top chats, open promises, cursor freshness, watched chat count) suitable for session-start context.
Exposes both tools via FastMCP and documents the outbound pre-send “observe” discipline in the body prompt.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.

File	Description
tests/tools/test_read_interactions.py	Adds unit tests for interaction-log filtering behavior and edge cases.
tests/tools/test_body_bootstrap.py	Adds unit tests for bootstrap index packet contents and invariants (index-only).
tests/test_mcp_server_body_tools.py	Verifies both tools are registered in FastMCP and return JSON strings via MCP.
src/entrabot/tools/read_interactions.py	Implements bounded, structured, chronological reads over the interaction JSONL log.
src/entrabot/tools/body_bootstrap.py	Implements session-start body bootstrap packet (index-only operational snapshot).
src/entrabot/mcp_server.py	Registers new MCP tools `read_interactions` and `bootstrap_body_state`.
prompts/anatomy/identity-and-tools.md	Documents the new tools and the pre-send body-side observation discipline.

brandwe · 2026-06-09T21:52:57Z

+    if dt.tzinfo is None:
+        dt = dt.replace(tzinfo=UTC)
+    return dt


Verified: the bug reproduces. The regression passes since = cutoff_utc.astimezone(timezone(timedelta(hours=12))), chosen so that the offset-local date is later than the UTC date (cutoff_in_offset.date() > cutoff_utc.date()), together with an entry 3 days back at 22:00 UTC. The old code skipped that entry's UTC day file and returned []. Fixed in 8729251 by normalizing cutoff and now to UTC inside _days_to_scan before extracting .date(). Regression test: tests/tools/test_read_interactions.py::TestSinceFilter::test_since_with_non_utc_offset_scans_correct_utc_day.
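The fix described in this comment can be illustrated with a small stand-alone version of the day enumeration. It is a sketch under assumptions: the real _days_to_scan signature is not shown in this PR page, so the name `days_to_scan` and its parameters are hypothetical. Day files are keyed by UTC date, so both bounds are converted to UTC before `.date()` is taken.

```python
from datetime import date, datetime, timedelta, timezone


def days_to_scan(cutoff: datetime, now: datetime) -> list[date]:
    """List the UTC calendar days from `cutoff` through `now`, inclusive."""
    # Treat naive datetimes as UTC, then normalize aware ones to UTC.
    if cutoff.tzinfo is None:
        cutoff = cutoff.replace(tzinfo=timezone.utc)
    if now.tzinfo is None:
        now = now.replace(tzinfo=timezone.utc)
    first = cutoff.astimezone(timezone.utc).date()
    last = now.astimezone(timezone.utc).date()
    days = []
    day = first
    while day <= last:
        days.append(day)
        day += timedelta(days=1)
    return days
```

With a +12:00 offset, 2026-06-07T08:00+12:00 is 2026-06-06T20:00Z: taking `.date()` on the offset value gives June 7, while the UTC day file that holds matching entries is June 6.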

brandwe · 2026-06-09T21:52:58Z

+        last = datetime.fromisoformat(top["last_activity"].replace("Z", "+00:00"))
+        earlier = datetime.fromisoformat(
+            (result["top_chats_today"][0]["last_activity"]).replace("Z", "+00:00")
+        )
+        assert last == earlier  # sanity


Verified — last == earlier was parsing top["last_activity"] twice and comparing it to itself. Replaced with: capture earlier_ts / latest_ts before logging, then assert last_activity == latest_ts AND last_activity > earlier_ts. Now actually validates the recency-selection branch in _top_chats. Fixed in 8729251.

- _days_to_scan now normalizes cutoff + now to UTC before extracting calendar dates. A since with a non-UTC offset whose offset-local date differed from its UTC date would shift cutoff.date() and skip the earliest required UTC day file (silently losing matching entries). Regression test constructs a +12:00 since that exposes the bug: fails on old code, passes on new.
- Replace tautological last_activity assertion in test_includes_last_activity_and_last_sender with a real recency check: parse the actual newer ts and assert last_activity equals it AND is strictly greater than the older entry.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

brandwe requested a review from Copilot June 9, 2026 21:44

Copilot started reviewing on behalf of brandwe June 9, 2026 21:44

Copilot AI reviewed Jun 9, 2026

brandwe merged commit 8cb9760 into main Jun 9, 2026
5 of 9 checks passed

brandwe mentioned this pull request Jun 9, 2026

Test fixture pollution: send_teams_message tests in test_watch.py write to live interaction log #22

Open

brandwe merged 2 commits into main from feat/read-interactions-tool