Cognitive Memory System (CMS) is the "long-term brain" for autonomous agents. Unlike standard RAG systems, CMS implements a multi-tiered memory architecture (Episodic, Semantic, and Goal-driven) with a self-correcting reinforcement learning loop. It doesn't just store data; it reasons, prioritizes, and forgets intentionally to maintain peak performance.
- Hybrid Retrieval Engine: Combines FAISS (dense vector search) and BM25 (sparse keyword search) with dynamic weight adjustment based on query intent.
- Multi-Role LLM Routing: Native support for specialized agents—Planner, Executor, Evaluator, Rewriter, and Guardrail—optimizing cost and latency.
- Autonomous Goal Tree: Implements a hierarchical goal system with dynamic priority decay, urgency multipliers, and sub-goal dependency management.
- Memory Consolidation: Automatically abstracts raw episodic logs into "Semantic Rules" and "Patterns" using asynchronous workers.
- Cognitive Self-Correction: Detects factual contradictions and penalizes "hallucinated" memories through a hard-relevance gate.
- Enterprise Concurrency: Thread-safe operations using a custom RWLock for high-frequency agentic loops.
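The project's own RWLock lives in src/core/concurrency.py and is not reproduced here. As a rough illustration of the idea only, a reader-preferring readers-writer lock can be sketched with the standard library. The class layout, the `read()`/`write()` method names, and the `store` example are assumptions for this sketch, not the project's API.

```python
import threading
from contextlib import contextmanager


class RWLock:
    """Many concurrent readers or one exclusive writer (reader-preferring)."""

    def __init__(self) -> None:
        self._cond = threading.Condition()
        self._readers = 0       # number of readers currently inside
        self._writing = False   # True while a writer holds the lock

    @contextmanager
    def read(self):
        with self._cond:
            while self._writing:
                self._cond.wait()
            self._readers += 1
        try:
            yield
        finally:
            with self._cond:
                self._readers -= 1
                if self._readers == 0:
                    self._cond.notify_all()  # wake any waiting writer

    @contextmanager
    def write(self):
        with self._cond:
            while self._writing or self._readers > 0:
                self._cond.wait()
            self._writing = True
        try:
            yield
        finally:
            with self._cond:
                self._writing = False
                self._cond.notify_all()  # wake waiting readers and writers


lock = RWLock()
store = {"hits": 0}  # hypothetical shared state


def record_hit() -> None:
    with lock.write():
        store["hits"] += 1


def read_hits() -> int:
    with lock.read():
        return store["hits"]


threads = [threading.Thread(target=record_hit) for _ in range(50)]
for t in threads:
    t.start()
for t in threads:
    t.join()
total = read_hits()
```

A reader-preferring lock like this one can starve writers under a constant stream of reads. A production implementation would likely add writer priority or fairness.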
The system operates on four distinct layers:
- Working Memory: A short-term buffer for the immediate interaction context.
- Episodic Memory: Chronological logs of experiences (State-Action-Result-Reward).
- Semantic Memory: Consolidated "world knowledge" and extracted rules.
- Goal Memory: A strategic tree defining the agent's long-term objectives and current focus.
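To make the Goal Memory layer more concrete, here is a minimal sketch of a goal store with priority decay, an urgency multiplier, and dependency checks. The `Goal` fields, the half-life decay formula, and the `next_goal` helper are all assumptions chosen for illustration. The actual logic in long_term_goal.py may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Goal:
    name: str
    priority: float              # base priority in [0, 1]
    urgency: float = 1.0         # multiplier; > 1 means more urgent
    created_at: float = 0.0      # seconds, on the same clock as `now`
    depends_on: List[str] = field(default_factory=list)
    done: bool = False


def effective_priority(goal: Goal, now: float, half_life_s: float = 86_400.0) -> float:
    """Base priority, halved every `half_life_s` seconds, scaled by urgency."""
    age = max(0.0, now - goal.created_at)
    return goal.priority * goal.urgency * 0.5 ** (age / half_life_s)


def next_goal(goals: Dict[str, Goal], now: float) -> Optional[Goal]:
    """Pick the highest-priority unfinished goal whose dependencies are all done."""
    ready = [
        g for g in goals.values()
        if not g.done and all(goals[d].done for d in g.depends_on)
    ]
    return max(ready, key=lambda g: effective_priority(g, now), default=None)


goals = {
    "frontend": Goal("frontend", priority=0.9, depends_on=["template"]),
    "template": Goal("template", priority=0.5),
    "docs": Goal("docs", priority=0.6, urgency=0.5),
}
first = next_goal(goals, now=3600.0)    # "frontend" is blocked by "template"
goals["template"].done = True
second = next_goal(goals, now=3600.0)   # "frontend" is now unblocked
```

In this sketch the high-priority goal is skipped until its sub-goal completes, which is one simple way to express sub-goal dependency management.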
absolute-agentic-arch/
├── src/
│ ├── memory/
│ │ ├── vector_store.py # FAISS & BM25 Hybrid implementation
│ │ ├── long_term_goal.py # Goal Tree & Priority logic
│ │ └── episodic_memory.py # Experience logging & Consolidation
│ ├── core/
│ │ ├── engine.py # Main Agent Loop & Decision System
│ │ └── concurrency.py # Threading & RWLock utilities
│ ├── utils/
│ │ └── validators.py # JSON Schema & Hallucination filters
│
├── requirements.txt # Dependencies (LangChain, FAISS, rank_bm25)
└── main.py # System Entry Point
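The tree above lists vector_store.py as the FAISS and BM25 hybrid implementation. The fusion step of such a hybrid retriever might look roughly like the sketch below, which uses only the standard library and takes the two score sets as plain dictionaries. The min-max normalization, the `dense_weight_for` intent heuristic, and the sample scores are assumptions for illustration. The sketch also assumes that higher dense scores mean better matches.

```python
from typing import Dict, List, Tuple


def min_max(scores: Dict[str, float]) -> Dict[str, float]:
    """Rescale scores to [0, 1] so dense and sparse scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def dense_weight_for(query: str) -> float:
    """Hypothetical intent heuristic: short or quoted queries lean on keywords."""
    if '"' in query or len(query.split()) <= 2:
        return 0.3
    return 0.7


def fuse(
    dense: Dict[str, float],
    sparse: Dict[str, float],
    dense_weight: float,
) -> List[Tuple[str, float]]:
    """Weighted sum of normalized dense and sparse scores, best first."""
    d, s = min_max(dense), min_max(sparse)
    fused = {
        doc_id: dense_weight * d.get(doc_id, 0.0)
        + (1.0 - dense_weight) * s.get(doc_id, 0.0)
        for doc_id in set(d) | set(s)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)


dense_hits = {"m1": 0.82, "m2": 0.40, "m3": 0.75}   # made-up similarity scores
sparse_hits = {"m2": 7.1, "m3": 2.0, "m4": 5.5}     # made-up BM25 scores
ranked = fuse(dense_hits, sparse_hits, dense_weight_for("GSAP"))
```

Because the one-word query shifts weight toward the sparse side, the top BM25 hit ranks first here even though its dense score is the lowest.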
- Clone the repository:
  git clone https://github.com/yourusername/absolute-agentic-arch.git
  cd absolute-agentic-arch
- Install dependencies:
  pip install -r requirements.txt
- Required Libraries: faiss-cpu or faiss-gpu, langchain_community, sentence-transformers, rank_bm25, numpy
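from src.core.engine import CognitiveMemory
from langchain_openai import ChatOpenAI  # provided by the langchain-openai package
# Initialize LLM (ChatOpenAI reads the API key from the OPENAI_API_KEY environment variable)
llm = ChatOpenAI(model="gpt-4-turbo")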
# Initialize Brain
memory = CognitiveMemory(db_path="./memory_db", llms=llm)
# 1. Add a goal
memory.add_memory("Build a luxury AI frontend", category="goal", priority=0.9)
# 2. Agent Step: Observe state and determine action
state = "User wants a high-end website template with GSAP animations."
action, score = memory.agent_step(state)
print(f"Decision: {action} | Confidence Score: {score}")
# 3. Shutdown gracefully
memory.shutdown()

Every interaction is evaluated. If a retrieved memory leads to a successful outcome (high reward), its win_rate and importance are boosted. Conversely, memories that lead to "Failure Patterns" are penalized or flagged as "Contradictions."
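The reward loop described above could be sketched as follows. Only `win_rate` and `importance` come from the description. The `MemoryRecord` class, the thresholds, the step size, and the flagging rule are hypothetical values picked for the example.

```python
from dataclasses import dataclass


@dataclass
class MemoryRecord:
    text: str
    importance: float = 0.5
    visits: int = 0
    wins: int = 0
    flagged: bool = False   # stands in for a "Contradiction" flag

    @property
    def win_rate(self) -> float:
        return self.wins / self.visits if self.visits else 0.0


def apply_reward(
    mem: MemoryRecord,
    reward: float,
    success_threshold: float = 0.7,  # assumed cutoff for "high reward"
    step: float = 0.1,               # assumed boost/penalty size
    flag_below: float = 0.2,         # assumed win-rate floor
    min_visits: int = 3,             # do not flag on too little evidence
) -> None:
    """Boost a memory after a success, penalize it after a failure."""
    mem.visits += 1
    if reward >= success_threshold:
        mem.wins += 1
        mem.importance = min(1.0, mem.importance + step)
    else:
        mem.importance = max(0.0, mem.importance - step)
    if mem.visits >= min_visits and mem.win_rate < flag_below:
        mem.flagged = True


good = MemoryRecord("GSAP timelines suit hero animations")
bad = MemoryRecord("Use inline styles for every animation")
for r in (0.9, 0.8, 1.0):
    apply_reward(good, r)
for r in (0.1, 0.0, 0.2):
    apply_reward(bad, r)
```

After three rounds the first record has a win rate of 1.0 and a higher importance, while the second has a win rate of 0.0, a lower importance, and is flagged.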
To prevent "Context Pollution," the system performs Eviction based on a score of importance + confidence + (visits * 0.01). Low-value, noisy memories are automatically archived to maintain a slim, efficient vector space.
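As a worked example of the eviction score, a memory with importance 0.9, confidence 0.8, and 12 visits scores 0.9 + 0.8 + 12 * 0.01 = 1.82. The sketch below applies that formula and archives everything beyond a fixed capacity. The `Memory` class, the `capacity` parameter, and the sample values are assumptions. The source states only the scoring formula.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Memory:
    text: str
    importance: float
    confidence: float
    visits: int


def retention_score(m: Memory) -> float:
    """Score from the README: importance + confidence + (visits * 0.01)."""
    return m.importance + m.confidence + m.visits * 0.01


def evict(memories: List[Memory], capacity: int) -> Tuple[List[Memory], List[Memory]]:
    """Keep the `capacity` highest-scoring memories; return (kept, archived)."""
    ranked = sorted(memories, key=retention_score, reverse=True)
    return ranked[:capacity], ranked[capacity:]


pool = [
    Memory("a", importance=0.9, confidence=0.8, visits=12),
    Memory("b", importance=0.2, confidence=0.3, visits=1),
    Memory("c", importance=0.5, confidence=0.6, visits=40),
]
kept, archived = evict(pool, capacity=2)
```

Here the low-importance, rarely visited memory "b" is the one that gets archived.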
This project is licensed under the MIT License - see the LICENSE file for details.
Building a digital consciousness is a team effort. Feel free to open issues or submit PRs regarding RAG optimization, memory decay algorithms, or new LLM router roles.