Hierarchical Memory for AI Agents on Qwen Cloud
"Not all context is equal." (Confucius Paper, Meta + Harvard, arXiv:2512.10398)
Confucius Agent implements the 3-tier hierarchical memory system from the Confucius paper on top of Qwen Cloud. Unlike traditional RAG, which treats all context equally, our architecture distinguishes between three tiers:
| Tier | Priority | Type | Storage | Description |
|---|---|---|---|---|
| Mental Models | 🔴 Highest | Canonical knowledge | ChromaDB + Vector Search | Company policies, rules, verified facts |
| Observations | 🟡 Medium | Persistent learnings | PostgreSQL + Time-index | Patterns, decisions, notes from sessions |
| Raw Facts | 🟢 Lowest | Ephemeral context | Redis + TTL decay | Conversation logs, temporary data |
Result: An agent that never contradicts itself, reduces token consumption by up to 60%, and retrieves relevant information in milliseconds.
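The tier table can be read as a small data model. The following is a minimal sketch rather than the project's actual code: `Tier`, `MemoryItem`, and `resolve_conflict` are hypothetical names, and "the highest-priority tier wins when two items disagree" is an assumed reading of how the agent avoids contradicting itself.

```python
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    """Memory tiers from the table above; lower value = higher priority."""
    MENTAL_MODEL = 1   # canonical knowledge (ChromaDB)
    OBSERVATION = 2    # persistent learnings (PostgreSQL)
    RAW_FACT = 3       # ephemeral context (Redis)


@dataclass(frozen=True)
class MemoryItem:
    tier: Tier
    key: str    # hypothetical topic identifier, e.g. "refund_window"
    text: str


def resolve_conflict(items: list[MemoryItem]) -> dict[str, MemoryItem]:
    """Keep one item per key, preferring the highest-priority tier."""
    winners: dict[str, MemoryItem] = {}
    for item in items:
        current = winners.get(item.key)
        if current is None or item.tier < current.tier:
            winners[item.key] = item
    return winners


items = [
    MemoryItem(Tier.RAW_FACT, "refund_window", "User said refunds take 60 days"),
    MemoryItem(Tier.MENTAL_MODEL, "refund_window", "Policy: refunds within 30 days"),
]
print(resolve_conflict(items)["refund_window"].text)
# prints: Policy: refunds within 30 days
```

In this sketch a conversation log that disagrees with a verified policy never overrides it, regardless of the order in which the items were retrieved.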
Try it now without installing anything:
https://confucius.wagnersolutionsai.com
Upload documents, chat with the agent, and inspect the 3-tier memory in real time.
To run it locally, you need:
- Python 3.12+
- Docker & Docker Compose
- Qwen Cloud API key (or any OpenAI-compatible API for dev)
git clone https://github.com/your-org/confucius-agent.git
cd confucius-agent
cp .env.example .env
# Edit .env with your API keys
docker compose up -d

This starts Redis (Raw Facts), PostgreSQL (Observations), and ChromaDB (Mental Models).

pip install -r requirements.txt
streamlit run demo/app.py

Open http://localhost:8501 and ask anything. The agent:
- Checks Mental Models first (canonical truth)
- Queries Observations (past learnings)
- Reviews Raw Facts (current context)
- Answers with priority-ranked knowledge
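The steps above can be sketched as a small orchestrator. This is an illustration under stated assumptions, not the code in `retrieval_pipeline.py`: the three `search_*` coroutines are hypothetical stubs standing in for ChromaDB, PostgreSQL, and Redis lookups, and `build_context` is an invented name. The tiers are queried concurrently, and the results are then emitted in priority order.

```python
import asyncio


# Hypothetical stand-ins for the three stores; real lookups would hit
# ChromaDB, PostgreSQL, and Redis respectively.
async def search_mental_models(query: str) -> list[str]:
    return ["Policy: refunds within 30 days"]


async def search_observations(query: str) -> list[str]:
    return ["Customer asked about refunds last week"]


async def search_raw_facts(query: str) -> list[str]:
    return ["User: how long do refunds take?"]


# Order matters: earlier entries have higher priority.
TIERS = [
    ("mental_model", search_mental_models),
    ("observation", search_observations),
    ("raw_fact", search_raw_facts),
]


async def build_context(query: str, max_items: int = 5) -> list[str]:
    """Query every tier concurrently, then emit results in priority order."""
    results = await asyncio.gather(*(search(query) for _, search in TIERS))
    context = [
        f"[{name}] {text}"
        for (name, _), hits in zip(TIERS, results)
        for text in hits
    ]
    return context[:max_items]  # lowest-priority items are dropped first


context = asyncio.run(build_context("refund policy"))
print("\n".join(context))
```

Because `asyncio.gather` returns results in the order the coroutines were passed, truncating the list to `max_items` always removes Raw Facts before Observations, and Observations before Mental Models.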
confucius-agent/
├── confucius/                     # Core library
│   ├── __init__.py
│   ├── config.py                  # Dual API config (Qwen or fallback)
│   ├── qwen_client.py             # OpenAI-compatible client
│   ├── agent.py                   # Main agent with tool calling
│   └── memory/
│       ├── mental_models.py       # Layer 1: ChromaDB + embeddings
│       ├── observations.py        # Layer 2: PostgreSQL + time-index
│       ├── raw_facts.py           # Layer 3: Redis + TTL decay
│       └── retrieval_pipeline.py  # Priority-based orchestrator
├── demo/
│   └── app.py                     # Streamlit interface
├── tests/                         # Test suite
├── docker-compose.yml             # Infrastructure as code
├── Dockerfile                     # Demo container
└── requirements.txt               # Python dependencies
Develop with any OpenAI-compatible API (DeepSeek, Kimi, OpenAI), then switch to Qwen Cloud for submission:
# In .env:
API_MODE=fallback # Use during development
FALLBACK_API_KEY=sk-... # Your dev API key
FALLBACK_BASE_URL=https://api.deepseek.com/v1
# Switch to Qwen Cloud for submission:
API_MODE=qwen
QWEN_API_KEY=sk-... # Qwen Cloud hackathon credits
QWEN_BASE_URL=https://dashscope-intl.aliyuncs.com/compatible-mode/v1

How a query flows through the memory tiers:

                   User Query
                        │
                        ▼
┌────────────────────────────────────────────┐
│          Parallel Query All Tiers          │
├────────────────┬──────────────┬────────────┤
│ Mental Models  │ Observations │ Raw Facts  │
│ (ChromaDB)     │ (Postgres)   │ (Redis)    │
└───────┬────────┴──────┬───────┴─────┬──────┘
        │               │             │
        ▼               ▼             ▼
    Priority        Recency       TTL-based
    Score +         Weight ×      Age Check
    Threshold       Confidence
        │               │             │
        └───────────────┴─────────────┘
                        │
                        ▼
          Ranked Context (by priority)
                        │
                        ▼
           LLM (Qwen Cloud) → Response
Run the test suite:
pytest tests/ -v

License: MIT. Open source for the Qwen Cloud Global AI Hackathon 2026.
Built with ❤️ by Wagner Solutions AI for the MemoryAgent track.