Extract gemma-cli's 17 tools into a shared MCP layer (agy/Flash + Gemma fallback, no duplication) #73

Description

@JoshuaVSherman

Goal

Make Gemma's tool-bound capabilities runnable end-to-end by agy/Gemini Flash, without duplicating any tool code and without losing Gemma as a fallback. Achieved by extracting gemma-cli's tools into one shared, model-agnostic layer that agy, Gemma, and (optionally) Claude Code all consume.

Captured from 2026-06-14 discussion. Companion to #72 (copywriting routing). Supersedes the earlier under-scoped version of this issue (which listed only ~4 tools and assumed copying them into agy).

Epic — staged execution (opus)

Built as three tight, independently-verifiable Opus PRs (de-risks the Gemma fallback first):

Full inventory — 17 registered tools across 6 modules

(Source of truth: web-jam-tools/gemma-cli/gemma_cli/tools/, the Tool(name=…) registrations — not the task-queue snapshot.)

| Module | Tools |
|---|---|
| drive.py (8) | drive_read_text_file, drive_read_text_file_lines, drive_search_in_file, drive_update_text_file, drive_list_files, drive_trash_file, drive_move_file, drive_create_text_file |
| calendar.py (2) | calendar_list_events, calendar_create_event (mandatory conflict check) |
| gmail.py (2) | gmail_draft_email, gmail_search |
| templates.py (1) | generate_venue_email_from_template |
| venue_contacts.py (3) | lookup_venue_contact, lookup_venue_email_on_web, update_venue_contact |
| memory.py (1) | remember_fact (appends to GEMMA.md in Drive) |

(drive.py also has internal PDF helpers download_pdf_text / find_pdf_by_name — not separately registered.)
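
The "mandatory conflict check" on calendar_create_event can be pictured as an interval-overlap test that runs before any event is written. The following is a minimal sketch, assuming events are reduced to start/end datetimes; `Event` and `find_conflicts` are hypothetical names for illustration, not gemma-cli's actual code.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """Hypothetical, simplified stand-in for a calendar event."""
    summary: str
    start: datetime
    end: datetime


def find_conflicts(proposed: Event, existing: list[Event]) -> list[Event]:
    """Return the existing events whose time range overlaps the proposed one.

    Ranges are treated as half-open [start, end): two events overlap when
    each starts before the other ends, so back-to-back events do not clash.
    """
    return [e for e in existing if e.start < proposed.end and proposed.start < e.end]


existing = [
    Event("Soundcheck", datetime(2026, 6, 20, 17, 0), datetime(2026, 6, 20, 18, 0)),
    Event("Gig", datetime(2026, 6, 20, 19, 0), datetime(2026, 6, 20, 22, 0)),
]
proposed = Event("Dinner", datetime(2026, 6, 20, 18, 0), datetime(2026, 6, 20, 19, 30))

conflicts = find_conflicts(proposed, existing)
print([e.summary for e in conflicts])  # ['Gig']
```

A create call would refuse (or ask for confirmation) whenever the returned list is non-empty; which of the two the real tool does is not stated here.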

Harness layers — what's portable vs Gemma-specific

Analyzed the whole harness (cli.py 2564 lines, llm.py 575 lines, auth.py, guards.py, memory.py, queue.py):

  • tools/*.py — SHAREABLE. The tool logic is model-agnostic (googleapiclient, openpyxl, bs4, requests). Its only coupling to gemma-cli is import paths: the Tool descriptor (a generic function-schema dataclass, not Ollama-specific), auth.load_credentials, and guards. No model logic inside the tools.
  • guards.py — SHARE. File-discipline + protected-file policy (no version suffixes; protected Drive IDs incl. the RSVP MASTER and the canonical task queues). Model-agnostic safety; belongs in the shared core.
  • auth.py — SHARE (with creds relocation). Reuses the existing google-drive-mcp + gmail-mcp OAuth tokens (scopes: drive, calendar, documents, spreadsheets, gmail). Credentials are read from local files today; if the server ever runs off-laptop, reuse the secret-store pattern from #69 (Deploy daily-devotional generator to a scheduled cloud runtime (laptop-independent)).
  • llm.py — DO NOT PORT. It's the Ollama chat loop plus a large body of guard-rails built specifically for gemma's unreliable tool-calling (inline-JSON detection, consecutive/cyclic loop guards, leaked-template-token aborts, line-repetition aborts). agy/Flash uses Antigravity's own native tool loop and doesn't need these. This stays with gemma-cli for the fallback.
  • cli.py — DO NOT PORT. gemma-cli's REPL, system prompt, smalltalk/email-approval gates, /next dispatch. agy brings its own REPL. Stays as Gemma's harness.

Takeaway: the valuable, reusable asset is the 17 clean tool functions + guards + auth. Most of the time we spent went into llm.py + cli.py, which are Gemma-specific scaffolding that we keep for the fallback, not something to move.
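
To make "a generic function-schema dataclass" concrete, here is a minimal sketch of what a model-agnostic descriptor and registry could look like. The field names, `REGISTRY`, `register`, and `call_tool` are assumptions for illustration; the real Tool class in gemma_cli/tools/ may be shaped differently, and the gmail_search body is a stub rather than a Gmail API call.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Tool:
    """Hypothetical descriptor: name, docs, a JSON Schema for the
    arguments, and the plain Python callable that does the work."""
    name: str
    description: str
    parameters: dict[str, Any]
    handler: Callable[..., Any]


# Single shared registry that every consumer (agy, Gemma, ...) reads from.
REGISTRY: dict[str, Tool] = {}


def register(tool: Tool) -> Tool:
    """Add a tool once; a second definition with the same name is an error."""
    if tool.name in REGISTRY:
        raise ValueError(f"duplicate tool name: {tool.name}")
    REGISTRY[tool.name] = tool
    return tool


def call_tool(name: str, arguments: dict[str, Any]) -> Any:
    """Dispatch by name, independent of which model asked for the call."""
    return REGISTRY[name].handler(**arguments)


def _gmail_search(query: str, max_results: int = 10) -> list[str]:
    # Stub: the real tool would query the Gmail API here.
    return [f"stub result for {query!r}"][:max_results]


register(Tool(
    name="gmail_search",
    description="Search Gmail messages (stubbed for illustration).",
    parameters={
        "type": "object",
        "properties": {
            "query": {"type": "string"},
            "max_results": {"type": "integer"},
        },
        "required": ["query"],
    },
    handler=_gmail_search,
))

print(call_tool("gmail_search", {"query": "venue"}))
```

Nothing in the descriptor mentions Ollama or Antigravity, which is what makes it safe to lift out of gemma-cli.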

Architecture (reusability-first — addresses "no duplication" + "keep Gemma as fallback")

Extract the shared tool core into one Python MCP server — single source of truth, multiple consumers, zero duplicated tool code:

        webjam-tools-core (MCP server)
        = 17 tools + guards.py + auth.py
                 ▲        ▲          ▲
                 │        │          │
            agy/Flash  gemma-cli   Claude Code
           (mcp_config) (fallback)  (optional)
  • agy/Flash consumes it via ~/.gemini/config/mcp_config.json (Antigravity custom-MCP support — confirmed). Flash can then complete a task end-to-end (write copy AND call the tool).
  • gemma-cli stays as fallback — keeps its Ollama loop + guard-rails, but sources the tools from the shared core instead of owning them (dependency inversion: gemma-cli becomes a consumer).
  • Claude Code could also point at the same MCP server.

This is the opposite of the original "copy tools into agy" framing — one implementation, shared, which is the explicit ask.
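
One way to picture "one implementation, multiple consumers" is a single list of descriptors plus a thin adapter per consumer. This sketch assumes the shared core exports plain dict descriptors (`SHARED_TOOLS` is a hypothetical name). The two output shapes follow the function-calling format used by Ollama's chat API and the tool entries of an MCP tools/list result; how gemma-cli and the server actually wire this up is not specified in this issue.

```python
from typing import Any

# Hypothetical export of the shared core: one descriptor per tool.
SHARED_TOOLS: list[dict[str, Any]] = [
    {
        "name": "drive_list_files",
        "description": "List files in a Drive folder.",
        "parameters": {
            "type": "object",
            "properties": {"folder_id": {"type": "string"}},
            "required": ["folder_id"],
        },
    },
]


def to_ollama_tools(tools: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """gemma-cli side: wrap each descriptor in the function-calling shape."""
    return [
        {
            "type": "function",
            "function": {
                "name": t["name"],
                "description": t["description"],
                "parameters": t["parameters"],
            },
        }
        for t in tools
    ]


def to_mcp_tools(tools: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """MCP side: the same descriptors, with the schema under inputSchema."""
    return [
        {
            "name": t["name"],
            "description": t["description"],
            "inputSchema": t["parameters"],
        }
        for t in tools
    ]


print(to_ollama_tools(SHARED_TOOLS)[0]["function"]["name"])
print(sorted(to_mcp_tools(SHARED_TOOLS)[0]))
```

Both adapters only rename and wrap fields, so adding an 18th tool means touching the shared list once and neither consumer.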

Refactor shape

  1. Lift tools/*.py, guards.py, auth.py, and the Tool descriptor into a standalone package exposed as an MCP server (webjam-tools-core or similar).
  2. Re-point gemma-cli to import the tools from that core (so Gemma keeps working as fallback, no duplicate definitions).
  3. Add the agy mcp_config.json entry pointing at the server.
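
For step 3, the entry would use the mcpServers / command / args / env keys noted under the build-time verifications. The snippet below only prints a candidate entry and does not write to the config file. The server key, the `python -m webjam_tools_core` launch command, and the `WEBJAM_TOKEN_DIR` variable are placeholders assumed for illustration, not the project's real values.

```python
import json

# Placeholder values throughout: module name, env var, and token path
# are assumptions, to be replaced once the package actually exists.
entry = {
    "mcpServers": {
        "webjam-tools-core": {
            "command": "python",
            "args": ["-m", "webjam_tools_core"],
            "env": {"WEBJAM_TOKEN_DIR": "/path/to/existing/tokens"},
        }
    }
}

config_text = json.dumps(entry, indent=2)
print(config_text)
```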

Build-time verifications

  • Antigravity supports custom Python MCP servers via ~/.gemini/config/mcp_config.json (mcpServers: command/args/env). Confirmed.
  • Confirm the MCP server can reuse the existing drive-mcp/gmail-mcp tokens unchanged (and the secret-store path from #69, Deploy daily-devotional generator to a scheduled cloud runtime (laptop-independent), if it ever runs off-laptop).
  • Confirm gemma-cli still passes its guard-rail behavior after sourcing tools from the extracted core (no regression in the fallback path).
  • Decide remember_fact target when called outside gemma (GEMMA.md vs a model-neutral memory file).

Acceptance criteria

  • All 17 tools exposed via one shared MCP server; no tool logic duplicated across agy and Gemma.
  • agy/Flash can complete a gig-promo task end-to-end (copy + tool action) in one session.
  • Gemma still works as a fallback, consuming the same shared core.
  • No dependency on the OMEN desktop for the agy path.
  • Human-in-the-loop preserved (guards.py protected-file checks; gmail drafts, never auto-send).
  • Credentials via the existing token store (no desktop-local-only coupling for the agy path).
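
The human-in-the-loop criterion leans on guards.py, so a rough sketch of the two policies named earlier (no version suffixes, protected Drive IDs) may help when writing regression checks. Everything here is assumed for illustration: the IDs are placeholders, the regex is one guess at what counts as a "version suffix", and `GuardError`, `check_new_filename`, and `check_mutation_allowed` are hypothetical names rather than the real guards.py API.

```python
import re

# Placeholder IDs; the real protected set (RSVP MASTER, canonical task
# queues) lives in guards.py and is not reproduced here.
PROTECTED_FILE_IDS = frozenset({
    "PLACEHOLDER_RSVP_MASTER_ID",
    "PLACEHOLDER_TASK_QUEUE_ID",
})

# Assumed definition of a version suffix: _v2, -v3, " final", " copy",
# or "(1)" sitting directly before the extension or the end of the name.
_VERSION_SUFFIX = re.compile(
    r"(?:[_\- ]v\d+|[_\- ]final|[_\- ]copy|\s*\(\d+\))(?=\.[^.]+$|$)",
    re.IGNORECASE,
)


class GuardError(Exception):
    """Raised when a tool call would violate a file policy."""


def check_new_filename(name: str) -> None:
    """Reject names like notes_v2.txt so files are updated in place."""
    if _VERSION_SUFFIX.search(name):
        raise GuardError(f"version-suffixed filename not allowed: {name}")


def check_mutation_allowed(file_id: str) -> None:
    """Reject updates, moves, or trashing of protected files."""
    if file_id in PROTECTED_FILE_IDS:
        raise GuardError(f"file is protected: {file_id}")


check_new_filename("setlist.txt")           # passes silently
check_mutation_allowed("some-ordinary-id")  # passes silently
```

Because these checks take only a name or an ID, they can run inside the shared server regardless of which model issued the call.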

Token-spend rationale (decided)

Related

Metadata

Labels: Low (Priority: low), enhancement (New feature or request), opus (Codework executes via Claude Opus, claude-opus-tasks.txt lane)
