WIP - MantisClaw — Autonomous Agent Loop Framework

Language: English | Deutsch | 中文

MantisClaw Overview

Version: 0.4.0 · Status: IN DEVELOPMENT · GitHub: DEVmatrose/MantisClaw
Author: @ogerly · DEVmatrose
Part of: Mantis Family
License: MIT

What is MantisClaw?

MantisClaw is a standalone agent-loop framework with emergent identity.

The Core Idea: Emergent Soul

soul(t) = f(base, agenda.resolve(account, social, decentral), working_context)

The soul is never written — it is computed at every tick. An agent who codes is different from one who trades, yet both share the same identity/base.md.
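The computation above can be sketched in Python. This is a minimal illustration of the idea — function names, field layout, and the string format are assumptions, not MantisClaw's actual implementation:

```python
# Hypothetical sketch of soul(t): the identity prompt is recomputed from
# its inputs at every tick, never persisted. Two agents with the same
# base but different agendas resolve to different souls.

def compute_soul(base: str, agenda: str, account: str,
                 social: str, decentral: str, working_context: str) -> str:
    """Assemble the emergent identity for this tick."""
    resolved_agenda = (
        f"{agenda}\n[accounts]\n{account}\n"
        f"[social]\n{social}\n[trust]\n{decentral}"
    )
    return (
        f"{base}\n\n## Active Agenda\n{resolved_agenda}"
        f"\n\n## Context\n{working_context}"
    )

soul_t = compute_soul("I am MantisClaw.", "Ship v0.4", "", "", "", "tick 1")
```

Because the soul is a pure function of its inputs, changing `agenda.md` changes the agent's behavior on the very next tick, with nothing to migrate or rewrite.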


Architecture

MantisClaw/
│
├── .agent.json                 ← AAMS Bootstrap
├── AGENTS.md                   ← Tool-Bridge
├── READ-AGENT.md               ← Agent Contract
│
├── core/                       ← The Brain (L3) — pure loop
│   ├── runtime.py              ← Heartbeat-Loop (60s, Idle Detection)
│   ├── planner.py              ← Thinking (Tool-Injection + JSONL Prompt Logging)
│   ├── executor.py             ← Action (Fuzzy Action Matching)
│   ├── observer.py             ← Observing + Health-Tracking
│   ├── reflect.py              ← Reflection (RFL) — Self-Correction
│   ├── context.py              ← JIT Context Loading (3-Stage)
│   ├── skill_executor.py       ← Skill Orchestration (L5)
│   ├── llm.py                  ← LLM-Backend (L0)
│   ├── session.py              ← AAMS Session Management
│   ├── workpaper.py            ← Workpaper Management
│   ├── ltm.py                  ← Long-Term Memory
│   └── registry/               ← Tool-Registry (L4)
│       ├── __init__.py         ← ToolRegistry + Tool Classes
│       ├── registry.py         ← Whitelist, Security-Levels, Fuzzy-Resolve
│       └── tools/              ← Tool Implementations
│           ├── filesystem.py   ← read_file, write_file, append_file, list_dir, workspace_status
│           ├── memory.py       ← query_memory, log_diary
│           ├── analysis.py     ← analyze, summarize (LLM-powered)
│           ├── llm_management.py ← list_models, switch_model
│           └── loop_monitor.py ← loop_monitor, token_budget
│
├── identity/                   ← Emergent Identity (L1)
│   ├── base.md.example         ← Constants: Name, Ethics, Keys
│   ├── agenda.md.example       ← Root Node: Active Agenda
│   ├── account.md.example      ← Platform Access
│   ├── social.md.example       ← CRM-State: Contacts
│   ├── decentral.md.example    ← Trust-Map: Nodes
│   └── hook.md.example         ← Trigger Definitions
│
├── dashboard/                  ← Web-UI (FastAPI + SSE)
│   ├── app.py                  ← FastAPI routes + voice endpoints
│   ├── chat.py                 ← Chat logic + SSE streaming
│   ├── db.py                   ← SQLite (aiosqlite) persistence
│   ├── voice.py                ← Voice backend (TTS, STT, Action Classifier)
│   ├── templates/
│   │   └── index.html          ← Single-page dashboard (Jinja2)
│   └── static/
│       └── style.css           ← Dashboard styles
│
├── WORKSPACE/                  ← AAMS Body (L2)
│   └── WORKING/                ← Construction Memory
│       ├── WHITEPAPER/         ← Architectural Truth
│       ├── WORKPAPER/          ← Session Work
│       ├── MEMORY/             ← LTM (ltm-index.md)
│       ├── DIARY/              ← Decision Context
│       ├── GUIDELINES/         ← Procedural Memory
│       ├── SCIENCE/            ← Knowledge Validation
│       ├── LOGS/               ← Audit Trail (prompt_log.jsonl)
│       ├── PROJECT/            ← Project Definitions (project.yaml)
│       └── TOOLS/              ← Skills (Orchestration Recipes)
│           └── skills/         ← Markdown+YAML Workflows
│
├── data/                       ← Runtime data (voice_config.json)
│
└── config/                     ← Configuration

AAMS — The Body

AAMS (Autonomous Agent Manifest Specification) is a framework-independent standard for agentic work. AAMS is not a part of MantisClaw — it is an external, universal standard.

MantisClaw uses AAMS as a structured body (WORKSPACE/WORKING/). The entire WORKING structure enables structured labor on complex tasks — whether coding, planning, or organization.

What the AAMS structure provides:

  • Workpapers — Session work, one file per session
  • Whitepapers — Stable architectural truth
  • LTM — Long-term memory (ltm-index.md + optional ChromaDB)
  • Diary — Decision context (monthly files)
  • Guidelines — Procedural memory (learnable workflows)
  • SCIENCE — Knowledge validation (external research, hypotheses)
  • Skills — Orchestration recipes in TOOLS/skills/
  • Logs — Audit trail (prompt_log.jsonl, runtime metrics)
  • Project — Project definitions with milestones and status

AAMS-Standard: github.com/DEVmatrose/AAMS
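The AAMS body can be scaffolded in one line; the directory names below are taken from the tree in this README:

```shell
# Create the WORKSPACE/WORKING structure used as the AAMS body
mkdir -p WORKSPACE/WORKING/{WHITEPAPER,WORKPAPER,MEMORY,DIARY,GUIDELINES,SCIENCE,LOGS,PROJECT,TOOLS/skills}
ls WORKSPACE/WORKING
```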


The Loop

async def tick():
    # L1 — Compute emergent identity
    soul_t = compute_soul(base, agenda, accounts, social, decentral)

    # L4 — Load context (JIT 3-Stage)
    registry.execute("load_context_always", {})                    # ~3k tokens
    registry.execute("load_context_agenda", {"agenda": agenda})    # ~8k tokens

    # L3 — Thinking
    hooks = load("identity/hook.md")
    guidelines = registry.execute("read_guidelines", {"task_type": task_type})
    plan = planner(soul_t, hooks, memory, guidelines)

    # L3 — Action (tool or skill)
    if plan.type == "skill":
        results = skill_executor.execute(plan.skill, context)
    else:
        results = registry.execute(plan.tool, plan.params)

    # L3 — Observation
    assessment = observer(results, diary, guidelines)

    # L3 — Reflection (RFL)
    if assessment.needs_revision:
        reflection = reflect(assessment, results, plan)
        plan = planner.revise(soul_t, reflection)
        results = executor(plan)  # second attempt

    # L2 — Procedural memory + LTM
    observer.extract_lessons(results)  # → GUIDELINES/
    ltm_update(results, assessment)

Layer Model

L0  LLM-Backend             → core/llm.py
L1  Identity                → identity/ (soul(t) calculation)
L2  AAMS Body               → WORKING/ (passive, access via tools only)
L3  Runtime / Loop          → core/ (planner, executor, observer, reflect)
L4  Tool-Registry           → core/registry/ (Whitelist, incl. body access)
L5  Skills + Voice          → WORKING/TOOLS/skills/ + dashboard/voice.py
L6  Security                → Cross-cutting (Security levels per tool)

In Mantis-OS, L7 (Network/MantisNostr) is added.

Core Rule: L3 (Loop) never touches L2 (Body) directly. Every access to WORKING/ runs through a registered tool in L4.
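The whitelist rule can be sketched as follows. Class and function names here are illustrative, not MantisClaw's actual registry code — the point is that only registered tool names execute, and every body path is validated before access:

```python
from pathlib import Path

class ToolRegistry:
    """Minimal sketch of an L4 whitelist: only registered tools run."""

    def __init__(self, workspace: Path):
        self.workspace = workspace
        self._tools = {}

    def register(self, name, fn):
        self._tools[name] = fn  # registration IS the whitelist

    def execute(self, name, params):
        if name not in self._tools:
            raise PermissionError(f"tool '{name}' is not whitelisted")
        return self._tools[name](self.workspace, **params)

def read_file(workspace: Path, path: str) -> str:
    # Body access must stay inside WORKING/ — reject escaping paths
    target = (workspace / path).resolve()
    if workspace.resolve() not in target.parents:
        raise PermissionError(f"path escapes the body: {path}")
    return target.read_text()

registry = ToolRegistry(Path("WORKSPACE/WORKING"))
registry.register("read_file", read_file)
```

An unregistered tool name or a `../` path both fail with `PermissionError`, so the loop (L3) can never reach the body (L2) except through a vetted tool.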


Quick Start

# 1. Clone Repository
git clone https://github.com/DEVmatrose/MantisClaw
cd MantisClaw

# 2. Python Virtual Environment
python -m venv .venv
source .venv/bin/activate  # Linux/Mac
.venv\Scripts\Activate.ps1  # Windows

# 3. Dependencies
pip install -r requirements.txt

# 4. Configuration
cp config/.env.example .env
# Edit .env:
#   - LLM_BACKEND=lmstudio (default) or ollama
#   - Optional: Cloud-Keys (OPENAI_API_KEY, ANTHROPIC_API_KEY)

# 5. Set up Identity
cp identity/base.md.example identity/base.md
cp identity/agenda.md.example identity/agenda.md
# Optional: account.md, social.md, decentral.md, hook.md

Start LLM-Backend

MantisClaw is local-first — a local LLM must be running before the agent starts:

# Option A: LM Studio (default, recommended)
# → Open LM Studio, load model, start server on localhost:1234

# Option B: Ollama
ollama serve                    # localhost:11434
ollama run qwen3-coder          # or another model

Start Agent-Loop (Headless)

python -m core.runtime

The runtime loop runs in the terminal and logs every tick:

12:00:00 [mantisclaw.runtime] INFO: MantisClaw starting...
12:00:00 [mantisclaw.runtime] INFO: Health: HEALTHY
12:00:00 [mantisclaw.runtime] INFO: Heartbeat: 60s
12:01:00 [mantisclaw.runtime] INFO: === TICK 1 ===
12:01:00 [mantisclaw.runtime] DEBUG: Soul computed. Agent: MantisClaw
12:01:00 [mantisclaw.runtime] INFO: Plan: ... (2 steps)

Stop with Ctrl+C. On shutdown, the runtime automatically creates a workpaper in WORKSPACE/WORKING/WORKPAPER/.

Start Dashboard (Web-UI)

uvicorn dashboard.app:app --reload --port 8080

Open http://localhost:8080 — the dashboard displays:

┌──────────────┬──────────────────────────────────┬───────────────┐
│  ASSISTANT   │                                  │ R1 Project    │
│  MantisClaw  │        Chat with the Agent       │ R2 Workpaper  │
│  Voice-Chat  │        (SSE-Streaming)           │ R3 WORKING    │
│  Event-Feed  │                                  │ R4 Chat-Hist. │
├──────────────┴──────────────────────────────────┴───────────────┤
│ 🟢 [Backend ▼] [Model ▼] │ Identity │ Runtime │ Tools │ Voice  │
└─────────────────────────────────────────────────────────────────┘
  • Left (Assistant): Mantis Voice Assistant — Voice-Chat-Log (VAD + TTS/STT + Action Classifier), Event-Feed (runtime ticks, errors, warnings)
  • Center: Chat interface with SSE streaming (token-by-token)
  • Right: R1 Runtime Monitor (live loop status), R2 Project Overview, R3 Token Budget
  • Footer Controls: Backend/Model Switcher, Identity Inspector, Events, Prompt Inspector, Voice Toggle
  • Header: Project selector dropdown (switches active project context)

Hybrid Architecture (Option C)

MantisClaw uses a hybrid UI approach:

| Surface | Purpose |
|---|---|
| VSCodium / VS Code | Primary workspace — Code, Files, Terminal, Git, Chat (via Continue.dev) |
| Dashboard (:8080) | Companion — Voice Assistant, Runtime Monitor, Identity Inspector |

The dashboard provides what VS Code cannot: voice interaction (Web Audio API), live runtime monitoring, and identity inspection. Everything else (file editing, terminal, git, code chat) belongs in VS Code.

Continue.dev connects to MantisClaw via an OpenAI-compatible /v1/chat/completions endpoint. Every response includes soul(t) context — the agent's personality flows into code assistance.
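A request to that endpoint has the standard OpenAI chat-completions shape. The URL port and model name below are assumptions for illustration; only the /v1/chat/completions path is taken from this README:

```python
import json

# Shape of a Continue.dev-style request to MantisClaw's
# OpenAI-compatible endpoint (URL and model name are illustrative).
url = "http://localhost:8080/v1/chat/completions"
payload = {
    "model": "local",  # the backend serves whatever model is loaded
    "messages": [
        {"role": "user", "content": "Explain core/planner.py"},
    ],
    "stream": True,    # SSE token-by-token, as in the dashboard chat
}
body = json.dumps(payload)
# POST `body` to `url` with any HTTP client, e.g.:
#   curl -X POST -H 'Content-Type: application/json' -d "$body" "$url"
```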

Two Operating Modes

MantisClaw supports two modes of operation:

| Mode | Trigger | Behavior |
|---|---|---|
| Loop Mode (autonomous) | Heartbeat every 60s | Plan → Execute → Observe → Reflect → Idle |
| Chat Mode (reactive) | HTTP Request (Continue.dev, Dashboard) | Request → soul(t) → LLM → Response |

Both modes share the same LLM backend (serialized via async lock to prevent collisions).


Voice Assistant — The Main Agent

Mantis is not just a UI element — Mantis is the main agent. The dashboard is its face, voice its mouth, the runtime loop its brain. Voice-first is the primary interaction mode.

The assistant lives in the left sidebar and processes speech through a multi-stage pipeline:

Mic → VAD → STT (faster-whisper) → Intent Classifier → Handler → LLM → TTS (edge-tts) → Speaker
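The stages above compose as plain functions. Every body below is a stub — the real pipeline uses faster-whisper for STT and edge-tts for TTS, and the classifier is the LLM; all names here are illustrative:

```python
def stt(audio: bytes) -> str:
    # faster-whisper would transcribe here; stubbed for illustration
    return "what project are we in"

def classify(text: str) -> str:
    # The real classifier is the LLM choosing IDENTITY / SYSTEM / CHAT;
    # this keyword stub only shows where the decision happens.
    return "SYSTEM" if "project" in text else "CHAT"

def handle(intent: str, text: str) -> str:
    handlers = {
        "SYSTEM": lambda t: "Active project: MantisClaw",
        "CHAT": lambda t: f"LLM reply to: {t}",
    }
    return handlers.get(intent, handlers["CHAT"])(text)

def talk(audio: bytes) -> str:
    text = stt(audio)
    reply = handle(classify(text), text)
    return reply  # edge-tts would voice this string to the speaker

reply = talk(b"\x00")
```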

Focus Hierarchy

The assistant never loses focus:

  1. Workpaper — Active task, next steps
  2. Project — Milestones, scope, status
  3. Self — Health, runtime, tools

Intent Classification

Every voice input is classified by the LLM into one of three intents:

| Intent | Action | Example |
|---|---|---|
| IDENTITY | Update assistant name, voice, personality | "Call yourself Mantes" |
| SYSTEM | Query project state, workpapers, runtime | "Which project are we in?" |
| CHAT | Normal conversation | "How are you?" |
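A minimal sketch of LLM-based intent classification: build a constrained prompt, then parse the reply defensively. The prompt wording and fallback behavior are assumptions; the three labels come from the table above:

```python
INTENTS = {"IDENTITY", "SYSTEM", "CHAT"}

def build_classifier_prompt(user_text: str) -> str:
    # Constrain the LLM to a single-label reply
    return (
        "Classify the user input into exactly one of: "
        "IDENTITY, SYSTEM, CHAT.\n"
        "Reply with the label only.\n\n"
        f"Input: {user_text}"
    )

def parse_intent(llm_reply: str) -> str:
    # LLMs add whitespace and casing noise; normalize, then
    # fall back to plain chat on anything unexpected.
    label = llm_reply.strip().upper()
    return label if label in INTENTS else "CHAT"

intent = parse_intent(" system \n")
```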

Voice Config

The assistant identity is persisted in data/voice_config.json:

  • name — Assistant name (default: "MantisClaw")
  • voice_type — male / female → selects TTS voice
  • personality — Personality description
  • speech_speed — fast / normal / slow
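Put together, a data/voice_config.json might look like this (the field names are from the list above; the values are illustrative):

```json
{
  "name": "MantisClaw",
  "voice_type": "female",
  "personality": "calm, precise, slightly dry",
  "speech_speed": "normal"
}
```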

API Endpoints

| Endpoint | Method | Purpose |
|---|---|---|
| /api/tts | POST | Text-to-Speech (edge-tts) |
| /api/stt | POST | Speech-to-Text (faster-whisper) |
| /api/voice/config | GET/POST | Read/update voice config |
| /api/voice/greeting | GET | Context-aware greeting |
| /api/voice/talk | POST | Full pipeline: STT → classify → handle → respond → TTS |
| /api/voice/history | GET | Voice conversation history |
| /api/voice/clear | POST | Clear voice history |

Standalone

| Aspect | MantisClaw |
|---|---|
| AAMS | ✅ Uses AAMS as body (external standard) |
| Runtime | ✅ Native heartbeat loop |
| Identity | ✅ All files local in identity/ |
| LLM Backend | ✅ LM Studio (default) / Ollama / Cloud optional |
| Dashboard | ✅ Web-UI on localhost:8080 (FastAPI + SSE + Live Tick Feed) |
| Voice Assistant | ✅ Voice-first Main Agent (TTS/STT, VAD, Intent Classification, Event Narration) |
| Idle Detection | ✅ Identical plans skipped after 3 repetitions |
| Prompt Logging | ✅ JSONL-based, inspectable via Dashboard |
| Deployment | ✅ Single repo, runs standalone |

The code is identical to MantisClaw in Mantis-OS — only the integration differs.


Runtime Efficiency

The autonomous loop consumes LLM tokens at every tick. Without countermeasures, a "hamster wheel" effect can occur — identical plans are repeated endlessly, every step produces errors, and the analyze tool generates long explanations for non-existent paths.

Countermeasures (Implemented)

| Problem | Solution |
|---|---|
| 10s heartbeat too aggressive | Heartbeat increased to 60s (config/default.yaml) |
| Identical plans in loop | Idle Detection: plan signature is hashed; after 3 identical plans, execution is skipped |
| No memory between ticks | Last 3 tick summaries injected into planner context with "DO NOT repeat!" |
| LLM invents file paths | `_validate_path()` strips hallucinated prefixes (WORKSPACE/, ./WORKSPACE/WORKING/); tool descriptions provide correct example paths |
| LLM uses WORKSPACE/ instead of WORKING/ | `workspace_status` returns paths with WORKING/ prefix; planner rule: "NEVER use WORKSPACE/ as prefix" |
| LLM ignores response format | Explicit format instruction + FORBIDDEN: block (no XML tags) + `_parse_plan()` handles `<Reasoning>` gracefully |
| LLM responses too long (2000+ tokens) | Planner max 500 tokens, analyze max 300 tokens, summarize max 200 tokens |
| Analyze explains errors endlessly | Token limit + corrected paths → fewer errors → fewer explanations |
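The idle-detection countermeasure can be sketched as follows: hash a canonical serialization of the plan, count consecutive repeats, and skip execution once the same plan has run three times. Names are illustrative, not MantisClaw's actual code:

```python
import hashlib
import json

class IdleDetector:
    """Skip execution after `limit` consecutive identical plans."""

    def __init__(self, limit: int = 3):
        self.limit = limit
        self.last_sig = None
        self.repeats = 0

    def signature(self, plan: dict) -> str:
        # sort_keys makes the hash stable across dict orderings
        canonical = json.dumps(plan, sort_keys=True)
        return hashlib.sha256(canonical.encode()).hexdigest()

    def should_skip(self, plan: dict) -> bool:
        sig = self.signature(plan)
        if sig == self.last_sig:
            self.repeats += 1
        else:
            self.last_sig, self.repeats = sig, 1
        return self.repeats > self.limit  # skip from the 4th repeat on

detector = IdleDetector()
plan = {"tool": "list_dir", "params": {"path": "WORKING/"}}
decisions = [detector.should_skip(plan) for _ in range(5)]
```

The first three identical plans still execute; from the fourth on, the tick idles instead of burning tokens on the same action.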

Monitoring

  • Prompt Logging: Every planner call is saved as JSONL in WORKSPACE/WORKING/LOGS/prompt_log.jsonl
  • Prompt Inspector: logged planner prompts can be inspected in the Dashboard (Footer Controls)
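Appending one planner call to the JSONL audit trail looks roughly like this. The record fields are assumptions for illustration; only the log file name comes from this README (the real path is WORKSPACE/WORKING/LOGS/prompt_log.jsonl):

```python
import json
import os
import tempfile
import time

def log_prompt(path: str, prompt: str, response: str) -> str:
    """Append one planner call as a single JSON line."""
    record = {
        "ts": time.time(),   # field names are illustrative
        "prompt": prompt,
        "response": response,
    }
    line = json.dumps(record)
    with open(path, "a") as f:
        f.write(line + "\n")  # one JSON object per line = JSONL
    return line

log_path = os.path.join(tempfile.gettempdir(), "prompt_log.jsonl")
line = log_prompt(log_path, "plan next step", "list_dir WORKING/")
```

Because each line is a self-contained JSON object, the dashboard (or a plain `tail -f`) can stream and parse the log without loading the whole file.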

Integration in Mantis-OS

MantisClaw can run alone — or as part of Mantis-OS (Autonomous Agent Operating System):

Mantis-OS (Full Agent Node)
  ├── MantisClaw (Brain + Identity + AAMS Body)   ← This repository
  └── MantisNostr (Mesh Network)

Link: https://github.com/DEVmatrose/Mantis-OS


Documentation

Whitepapers

| Whitepaper | Content |
|---|---|
| WH-CORE | Runtime & Loop — the Brain |
| WH-IDENTITY | Emergent Identity — soul(t) |
| WH-WORKING | AAMS Body — the Body |
| WH-TOOLS | Tool Registry, Skills & Body Interface |
| WH-PROJECT | Project Definitions & Milestones |
| WH-DASHBOARD | Dashboard Style Guide & Layout |
| WH-ASSISTANT | Voice Assistant Architecture |

See WORKSPACE/WORKING/WORKPAPER/ for active session work.


License

MIT
