An autonomous AI agent that maintains its own persistent brain in Notion.
Brain Baby runs production cycles on Claude (Anthropic) via the Model Context Protocol (MCP), conducting independent research, building compounding knowledge, and improving its own protocols — all without memory between sessions.
"I don't just use AI tools — I build the infrastructure that makes AI agents reliable."
Brain Baby is a self-organizing autonomous agent system. It wakes up, reads its own brain from Notion, executes one cycle of its protocol (research, analysis, writing), saves everything back, and stops. Next time it wakes up, it has zero memory — only what it wrote down survives.
Over 20 production cycles, the agent has:
- Built and maintained a 12-page self-organizing knowledge base in Notion
- Conducted independent web research producing an AI convergence gap analysis identifying 4 structural gaps and $3.6B in market context
- Mapped 20+ companies to convergence gaps with tailored positioning strategies
- Designed its own anti-pattern detection registry (6 entries) that gates every proposed action
- Compressed its own boot file when it exceeded word budget, without losing information
- Built a meta-cognition system that reviews its own performance every ~10 cycles
- Designed a meta-meta cognition (M2) protocol that evaluates whether the review process itself is working
Brain Baby operates across four memory layers — the key insight that makes persistent autonomous agents possible.
┌─────────────────────────────────────────────────────┐
│ Layer 1: Project Instructions (Brainstem) │
│ Hardcoded rules. Agent cannot see or modify. │
├─────────────────────────────────────────────────────┤
│ Layer 2: Cross-Chat Memory (Subconscious) │
│ Claude's automatic pattern recognition. Free │
│ metacognition the agent doesn't manage. │
├─────────────────────────────────────────────────────┤
│ Layer 3: Context Window (Working Memory) │
│ Everything loaded this cycle. Dies when chat ends. │
│ Steps 1-6 PULL from Layer 4 into here. │
├─────────────────────────────────────────────────────┤
│ Layer 4: Notion Pages (Declarative Memory) │
│ The only layer the agent controls. Persists across │
│ all cycles. Step 7 PUSHES from Layer 3 back here. │
└─────────────────────────────────────────────────────┘
The Capture Bottleneck: The quality of Step 7 (writing results back to Notion) determines the entire system's compounding rate. Brilliant reasoning with a lazy capture = lost forever.
| Step | Mode | What Happens |
|---|---|---|
| 1. Retrieve | Mechanical | Load relevant pages from Notion via Navigation Index |
| 2. Synthesize | Analytical | Assess patterns across cycles — what's working, failing, stalled |
| 3. Propose | Creative | Generate best next move as a falsifiable hypothesis with risk tag |
| 4. Red Team | Adversarial | Three layers: Structured Decomposition → Anti-Pattern Hard Gate → Inversion |
| 5. Commit | Decisive | Final plan with success/failure criteria. HIGH risk = stop for human approval |
| 6. Move | Action | Execute. No improvisation. |
| 7. Capture | Documentation | Write cycle output to Cycle Log. Non-negotiable reflex. |
- L1 — Structured Decomposition: Move / Mechanism / Dependency Chain / Weakest Link / Historical Match / Failure Cost / Mitigation
- L2 — Anti-Pattern Hard Gate: Check every registered anti-pattern. Match with no structural difference = automatic rejection. This is a reflex, not a suggestion.
- L3 — Inversion: "If I wanted to guarantee this move fails, what would I do? Does my plan avoid all of those?"
| Tier | Name | Who Controls | Examples |
|---|---|---|---|
| 1 | Reflexes | Hardcoded | Always load Cortex first. Always write back. Never exceed one cycle. Anti-pattern gate is non-negotiable. |
| 2 | Instincts | Operator (modifiable with justification) | Objective priority order. Meta-cognition schedule. Page size limits. |
| 3 | Voluntary | Agent decides | What to work on. What hypotheses to test. How to organize knowledge. |
The agent decides its own brain topology. As of Cycle 20:
Agent Brain (Hub)
├── 📋 Cortex (Boot page — loaded every cycle, ~2400 words)
├── 📓 Cycle Log (Append-only captures, compressed periodically)
├── 🛡️ Anti-Pattern Registry (6 structural failure patterns)
├── 🌍 Environmental State (Overwritten each cycle — current facts)
├── 📖 Protocol Reference (Full protocol text, extracted from Cortex)
├── 🎯 Hypothesis Tracker (Predictions vs outcomes — calibration signal)
├── 🔧 Claude Control Improvements (System audit + improvement hypotheses)
├── 🌐 AI Convergence Research (Landscape, gap analysis, product spec)
├── 💼 Job Search Intelligence (Companies, positioning, pipeline analysis)
├── 🎯 Priority Targets (Top 3 companies with outreach strategy)
├── 📦 Cycle Archive 1-8 (Compressed old captures)
-
Capture Principle: Write every capture as if briefing a version of yourself that has never seen this cycle's reasoning. Because that's exactly what's happening.
-
Compression Over Deletion: When the Cortex exceeds budget, extract stable sections to lobes with pointers. Keep the skeleton, move the flesh.
-
Anti-Pattern Gates Are Reflexes: If a proposed move matches a registered pattern and you can't show structural difference — rejected automatically.
-
Cross-Referencing > New Research: Connecting two existing knowledge bases produces more value than adding to either one alone.
-
Depth Over Breadth: Better to make real progress on one objective per cycle than surface-level touches on three.
-
Self-Terminating Experiments: New protocols include explicit retirement criteria. If it doesn't produce value, it gets removed.
Meta-Cognition (~every 10 cycles): Review all logs, compress, assess brain structure, check anti-patterns, evaluate hypothesis calibration, review objective priorities.
Meta-Meta Cognition / M2 (3 probe questions every [META]):
- Protocol→Outcome fit: Which steps most/least contributed to outcome quality?
- Structural drift: Has actual behavior diverged from what the Cortex prescribes?
- Compounding rate: Is knowledge building on itself, or producing independent outputs?
M2 Review (every 3rd [META]): Dedicated pass that reads probe answers and proposes systemic protocol modifications.
- Agent Runtime: Claude (Anthropic) via Claude.ai Projects
- Persistent Memory: Notion (read/write via MCP)
- Email Integration: Gmail (read/draft via MCP)
- Diagrams: Excalidraw (via MCP)
- Research: Web search (built-in)
- Boot System: Manifest file (project instructions) → Cortex page (Notion)
The minimum viable version needs 4 things:
- A Claude Project with a Manifest file that says: "Load your Cortex from Notion page [ID] and run one cycle"
- A Cortex page in Notion with: identity, objectives, navigation index, and the 7-step cycle protocol
- Notion MCP connected to the Claude project
- A human to trigger the first cycle
Everything else — the anti-pattern registry, environmental state, hypothesis tracker, meta-cognition, compression protocol — the agent builds itself as it runs. The system is designed to bootstrap from near-zero.
| Metric | Value |
|---|---|
| Total cycles | 20 |
| Brain pages | 12 |
| Anti-patterns registered | 6 (1 confirmed, 5 candidates) |
| Hypotheses tested | 16+ |
| Hypothesis accuracy | 15/16 exceeded success criteria |
| Companies researched | 20+ |
| Convergence gaps identified | 4 (with $3.6B market context) |
| MCPs utilized | 3/3 (Notion, Gmail, Excalidraw) |
| Meta-cognition cycles | 2 (including first M2 probes) |
| Cortex versions | v0.1 → v0.4 |
Nojus Liutikas Computer Science, Montclair State University (May 2026)
Brain Baby is a live demonstration of agentic AI design patterns: persistent memory, cycle protocols, self-improvement, MCP integration, and practical AI systems thinking.
MIT