A production-grade AI agent framework with 60+ composable skills, autonomous task execution, voice interaction, and persistent memory. Built on Anthropic's Claude Code as the foundation for a fully autonomous personal AI assistant.
After working extensively with AI assistants, I noticed a fundamental gap: every session starts from zero. There is no continuity, no memory of preferences, no ability to proactively take action. I wanted an AI system that:
- Remembers everything -- past decisions, preferences, learnings across sessions
- Acts autonomously -- executes multi-step workflows without constant supervision
- Composes capabilities -- chains specialized skills together for complex tasks
- Speaks and listens -- bidirectional voice interaction, not just text
Kaya is the result: a skill-based architecture where each capability is a self-contained module that Claude Code can discover, load, and execute. The system handles everything from calendar management and grocery shopping to security reconnaissance and multi-agent debates.
kaya/
skills/ # 60+ composable skill modules
Agents/ # Multi-agent orchestration and composition
AutonomousWork/ # Parallel task execution engine
CalendarAssistant/ # Google Calendar automation
VoiceInteraction/ # Bidirectional voice (desktop + mobile)
Browser/ # Playwright-based browser automation
... # 55+ more skills
agents/ # Agent personality definitions and traits
bin/ # CLI tools and cron scripts
hooks/ # Git hooks and lifecycle automation
lib/ # Shared libraries (cron, daemon, messaging)
VoiceServer/ # ElevenLabs-powered TTS server
MEMORY/ # Persistent state, learnings, and context
Observability/ # System monitoring and health checks
KAYASECURITYSYSTEM/ # Security protocols and threat models
The AutonomousWork skill orchestrates parallel agent execution -- multiple Claude instances working on independent tasks simultaneously with branch-isolated git operations.
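As a rough illustration of branch-isolated parallel execution, the sketch below runs a batch of tasks concurrently and gives each one its own branch name. The `Task` shape, the `auto/` branch prefix, and the `worker` callback are assumptions made for this example; the README does not show the actual AutonomousWork queue format or how Claude instances are launched.

```typescript
// Hypothetical task shape; the real work-queue format is not shown in this README.
interface Task {
  id: string;
  title: string;
}

interface TaskResult {
  taskId: string;
  branch: string;
  ok: boolean;
  detail: string;
}

// Derive one branch name per task so parallel agents never share a branch.
// The "auto/" prefix is an illustrative convention, not Kaya's documented one.
function branchFor(task: Task): string {
  const slug = task.title
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
  return `auto/${task.id}-${slug}`;
}

// Run all tasks concurrently. Each task catches its own failure, so one
// failing task does not cancel or reject the rest of the batch.
async function runInParallel(
  tasks: Task[],
  worker: (task: Task, branch: string) => Promise<string>,
): Promise<TaskResult[]> {
  return Promise.all(
    tasks.map(async (task): Promise<TaskResult> => {
      const branch = branchFor(task);
      try {
        const detail = await worker(task, branch);
        return { taskId: task.id, branch, ok: true, detail };
      } catch (err) {
        return { taskId: task.id, branch, ok: false, detail: String(err) };
      }
    }),
  );
}
```

In a real setup the `worker` would presumably check out the branch in a separate working copy and start an agent there; that part is left out because it depends on details this README does not cover.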
Skills are composable modules with standardized interfaces. Each skill exposes:
- A SKILL.md manifest with triggers, workflows, and integration points
- Optional TypeScript tooling in Tools/ directories
- Workflow definitions in Workflows/ directories
- Context files that load domain knowledge on demand
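The parts listed above could be checked with a small helper like the one below. It is a sketch that works on a plain list of directory entries instead of the file system, and it assumes that context files can be recognized by a `Context.md` suffix. The `SkillParts` type and `inspectSkill` function are hypothetical names, not part of Kaya.

```typescript
// Summary of which standard parts a skill directory exposes.
interface SkillParts {
  hasManifest: boolean;
  hasTools: boolean;
  hasWorkflows: boolean;
  contextFiles: string[];
}

// `entries` is the top-level listing of one skill directory, with
// directories written with a trailing slash (for example "Tools/").
function inspectSkill(entries: string[]): SkillParts {
  return {
    hasManifest: entries.includes("SKILL.md"),
    hasTools: entries.includes("Tools/"),
    hasWorkflows: entries.includes("Workflows/"),
    // Assumption: context files end in "Context.md" (e.g. "_Context.md").
    contextFiles: entries.filter((entry) => entry.endsWith("Context.md")),
  };
}

const parts = inspectSkill(["SKILL.md", "_Context.md", "Tools/"]);
console.log(parts.hasManifest, parts.hasWorkflows, parts.contextFiles);
```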
Bidirectional voice system supporting desktop (local mic/speaker) and mobile (Telegram) channels, powered by ElevenLabs TTS with configurable voice personalities per agent.
The MEMORY/ subsystem provides:
- Learning signals -- Pattern recognition across sessions with sentiment tracking
- State management -- Persistent JSON state for skills, work queues, and cron jobs
- Validation logs -- Configuration and work integrity checks
- Voice event history -- Timestamped voice interaction logs
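To make the "persistent JSON state with validation" idea concrete, here is a minimal sketch of validating a state document before trusting it. The `WorkQueueState` fields are invented for illustration; the actual schemas under MEMORY/ are not described in this README.

```typescript
// Hypothetical state shape for a work queue; field names are assumptions.
interface WorkQueueState {
  version: number;
  updatedAt: string; // ISO 8601 timestamp
  pending: string[];
}

// Type guard: accepts only objects that match the expected shape exactly
// enough to be used safely by the rest of the program.
function isWorkQueueState(value: unknown): value is WorkQueueState {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  const version = record["version"];
  const updatedAt = record["updatedAt"];
  const pending = record["pending"];
  return (
    typeof version === "number" &&
    typeof updatedAt === "string" &&
    !Number.isNaN(Date.parse(updatedAt)) &&
    Array.isArray(pending) &&
    pending.every((item) => typeof item === "string")
  );
}

// Parse raw file contents; fall back to a known-good default when the JSON
// is malformed or fails validation, so corrupt state never propagates.
function parseState(raw: string, fallback: WorkQueueState): WorkQueueState {
  try {
    const parsed: unknown = JSON.parse(raw);
    return isWorkQueueState(parsed) ? parsed : fallback;
  } catch {
    return fallback;
  }
}
```

Reading and writing the file itself is omitted; under Bun that could be done with the standard `node:fs` functions, with `parseState` applied to whatever was read.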
The Agents/ skill enables dynamic agent composition with:
- Specialized agent roles (Engineer, Designer, Researcher)
- Personality trait mapping and voice assignment
- Parallel orchestration with branch isolation
- Council-style multi-agent debates
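One way to picture role, trait, and voice mapping is the sketch below. The three roles come from the list above; the trait names, the voice identifiers, and the `composeAgent` function are placeholders made up for this example and do not correspond to real ElevenLabs voices or to Kaya's actual agent definitions.

```typescript
type Role = "Engineer" | "Designer" | "Researcher";

interface AgentSpec {
  name: string;
  role: Role;
  traits: string[];
  voiceId: string;
}

// Hypothetical trait-to-voice table; the identifiers are placeholders.
const VOICE_BY_TRAIT = new Map<string, string>([
  ["analytical", "voice-calm"],
  ["playful", "voice-bright"],
]);
const DEFAULT_VOICE = "voice-neutral";

// Build an agent spec and assign the voice of the first trait that has a
// mapping, falling back to a default voice when none of the traits match.
function composeAgent(name: string, role: Role, traits: string[]): AgentSpec {
  let voiceId = DEFAULT_VOICE;
  for (const trait of traits) {
    const mapped = VOICE_BY_TRAIT.get(trait);
    if (mapped !== undefined) {
      voiceId = mapped;
      break;
    }
  }
  return { name, role, traits, voiceId };
}

console.log(composeAgent("Ada", "Engineer", ["curious", "analytical"]));
```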
| Category | Skills | Description |
|---|---|---|
| Core | System, lib/core | System kernel and maintenance |
| Agents | Agents, AgentMonitor, Council, Simulation | Multi-agent orchestration and evaluation |
| Productivity | CalendarAssistant, Gmail, Kaya, DailyBriefing | Personal assistant capabilities |
| Development | AgentProjectSetup, CreateCLI, CreateSkill, Browser | Engineering and automation tools |
| Research | OSINT, Recon, FirstPrinciples, RedTeam | Intelligence gathering and analysis |
| Content | ContentAggregator, Fabric, Obsidian, KnowledgeGraph | Knowledge management and synthesis |
| Commerce | Shopping, Instacart, Cooking | Consumer automation |
| Communication | Telegram, VoiceInteraction, CommunityOutreach | Messaging and outreach |
| Security | WebAssessment, PromptInjection, KAYASECURITYSYSTEM | Security testing and protocols |
| Meta | SkillAudit, SpecSheet, Evals, KayaUpgrade | Self-improvement and quality |
- Runtime: Bun (TypeScript/JavaScript)
- AI Foundation: Claude Code (Anthropic)
- Voice: ElevenLabs TTS with WebSocket streaming
- Browser Automation: Playwright CLI (Browse.ts)
- Messaging: Telegram Bot API
- Calendar: Google Calendar CLI
- State: JSON-based persistent state with validation
- Scheduling: macOS launchd for cron-style automation
# Clone and install
git clone https://github.com/[user]/kaya.git ~/.claude
cd ~/.claude
bun run install.ts
# Start the voice server
cd VoiceServer && ./start.sh
# Launch Claude Code with Kaya loaded
claude
See INSTALL.md for detailed setup instructions.
Each skill follows a standardized structure:
skills/ExampleSkill/
SKILL.md # Manifest: triggers, workflows, integration
_Context.md # Domain knowledge loaded on demand
Tools/ # TypeScript utilities
Workflows/ # Step-by-step workflow definitions
Skills are discovered and loaded dynamically by the CORE router based on keyword matching in user requests. The router reads each skill's USE WHEN trigger clause to determine relevance.
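Keyword-based routing of this kind can be sketched as follows. This is not the CORE router's code: it assumes each skill's USE WHEN clause has already been extracted into a string, and it scores skills by plain word overlap with the request. `SkillEntry`, `tokenize`, and `rankSkills` are hypothetical names.

```typescript
interface SkillEntry {
  name: string;
  useWhen: string; // text of the skill's USE WHEN trigger clause
}

// Lowercase the text and return its distinct alphanumeric words.
function tokenize(text: string): string[] {
  return Array.from(new Set(text.toLowerCase().match(/[a-z0-9]+/g) ?? []));
}

// Score each skill by how many of its trigger words appear in the request,
// drop skills with no overlap, and return the rest with best match first.
function rankSkills(
  request: string,
  skills: SkillEntry[],
): { name: string; score: number }[] {
  const requestWords = tokenize(request);
  return skills
    .map((skill) => ({
      name: skill.name,
      score: tokenize(skill.useWhen).filter((word) => requestWords.includes(word)).length,
    }))
    .filter((ranked) => ranked.score > 0)
    .sort((a, b) => b.score - a.score);
}

const ranked = rankSkills("Schedule a meeting for Friday", [
  { name: "CalendarAssistant", useWhen: "calendar, schedule, meeting" },
  { name: "Cooking", useWhen: "recipe, meal, cooking" },
]);
console.log(ranked);
```

Plain word overlap treats common words such as "a" or "for" as matches if they appear in a trigger clause, so a practical router would likely need a stop-word list or weighting on top of this.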
# Run the installer wizard
bun run install.ts
# Validate system integrity
# (within a Claude Code session)
/system integrity check
# Audit skill quality
/skill-audit
- Installation Guide -- Prerequisites, setup, and configuration
- Architecture -- System design and data flow
- ADR-001: Skill-based Architecture
- ADR-002: Memory Persistence
- Voice Server -- TTS server setup and usage
MIT
- ai-assistant — Autonomous AI assistant powered by Claude Code
- mcp-toolkit-server — MCP server toolkit for Claude AI integration
- context-engineering-toolkit — Context window optimization tools