Generic, multi-method anomaly detection and root cause analysis platform. Connects to any monitoring data source via config file. No code changes required per integration.
Data Source (MQTT/Kafka/HTTP)
│
▼
[Collector] ── reads config/mmt-rca.yml
│ maps raw messages → observations
▼
[Analysis API] ── FastAPI (port 8000)
│
├── Statistical anomaly detector (z-score)
├── Isolation Forest detector (unsupervised ML)
├── Similarity engine (adjusted cosine vs. knowledge base)
├── SHAP attribution (which attributes drove the result)
└── LLM synthesis (Ollama + llama3.1 → root cause narrative)
│
▼
[PostgreSQL + TimescaleDB + pgvector]
[Redis] ── real-time pub/sub
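
To make two of the analysis stages above more concrete, here is a minimal sketch of per-attribute z-score detection and a mean-centred cosine similarity. It is an illustration, not the platform's implementation: the function names and baseline data are hypothetical, and reading "adjusted cosine" as "subtract a per-attribute centre before comparing" is an assumption.

```python
import math
from statistics import mean, pstdev


def zscores(baseline, observation):
    """Per-attribute z-score of `observation` against `baseline` samples."""
    scores = {}
    for name, value in observation.items():
        history = [row[name] for row in baseline]
        mu, sigma = mean(history), pstdev(history)
        # A constant baseline has sigma == 0; report 0.0 instead of dividing by zero.
        scores[name] = 0.0 if sigma == 0 else (value - mu) / sigma
    return scores


def adjusted_cosine(a, b, center):
    """Cosine similarity of a and b after subtracting `center` per attribute."""
    keys = sorted(set(a) & set(b) & set(center))
    va = [a[k] - center[k] for k in keys]
    vb = [b[k] - center[k] for k in keys]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    return 0.0 if norm == 0 else dot / norm


# Hypothetical baseline collected during normal operation.
baseline = [
    {"cpu": 0.30, "ms_delay": 40.0},
    {"cpu": 0.35, "ms_delay": 50.0},
    {"cpu": 0.25, "ms_delay": 60.0},
]
observation = {"cpu": 0.97, "ms_delay": 2850.0}

scores = zscores(baseline, observation)
# Flag attributes more than 3 standard deviations from the baseline mean.
flagged = sorted(name for name, z in scores.items() if abs(z) > 3.0)
print(flagged)
```

The threshold of 3.0 is a common default rather than a documented setting. In practice attributes on very different scales would usually be normalised before the similarity step.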
- Docker + Docker Compose
- 8 GB RAM minimum (for llama3.1; use llama3.2:3b for lighter machines)
cp .env.example .env
# Edit .env if needed (DB password, model choice)

On macOS: Run Ollama natively for GPU (Metal) acceleration:
brew install ollama
ollama serve # runs on localhost:11434
ollama pull llama3.1

Then set in .env:
OLLAMA_URL=http://host.docker.internal:11434
Then remove the ollama and ollama-init services from the Docker Compose file, or override them with docker-compose.dev.yml.
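
Before starting the stack against a native Ollama, it can help to confirm that the model has actually been pulled. The sketch below assumes Ollama's /api/tags endpoint returns a payload of the form {"models": [{"name": ...}]} and that an untagged model name resolves to the :latest tag; has_model and fetch_tags are hypothetical helper names.

```python
import json
import urllib.request


def has_model(tags_payload, model):
    """True if `model` appears in an Ollama /api/tags response payload."""
    # Assumption: a bare name such as "llama3.1" is stored as "llama3.1:latest".
    wanted = model if ":" in model else model + ":latest"
    names = {m.get("name", "") for m in tags_payload.get("models", [])}
    return wanted in names


def fetch_tags(base_url="http://localhost:11434", timeout=5.0):
    """Fetch the locally available models from a running Ollama server."""
    with urllib.request.urlopen(base_url + "/api/tags", timeout=timeout) as resp:
        return json.load(resp)


# Offline example payload; against a live server use has_model(fetch_tags(), ...).
sample = {"models": [{"name": "llama3.1:latest"}, {"name": "phi3:mini"}]}
print(has_model(sample, "llama3.1"))
```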
make up
# or: docker compose up -d

The first start downloads the llama3.1 model (~4.7 GB). Monitor with:
make logs # all services
docker compose logs -f ollama-init # model download progress

curl http://localhost:8000/health
# {"status":"ok","db":true,"ollama":true,"ollama_model":"llama3.1"}

curl -X POST http://localhost:8000/analyze \
-H "Content-Type: application/json" \
-d '{
"project_id": "default",
"observation": {
"timestamp": "2024-01-15T14:31:42Z",
"source_id": "gateway-01",
"attributes": {
"cpu": 0.97,
"ram": 0.04,
"nb_conn": 450,
"ms_delay": 2850,
"recv_rate": 0.02
}
}
}'

Response:
{
"event_id": "...",
"event_type": "UNKNOWN",
"anomaly_score": 0.82,
"best_match": null,
"top_k_matches": [],
"contributing_attributes": {"ms_delay": 0.71, "recv_rate": 0.18, ...},
"rca_narrative": "High message delay and near-zero receive rate suggest upstream network congestion or link failure.",
"rca_confidence": "MEDIUM",
"rca_actions": ["Check upstream ISP status", "Inspect gateway-01 network interface", "..."],
"detector_results": [...]
}

# Start a learning session (type NORMAL)
SESSION=$(curl -s -X POST http://localhost:8000/learning/sessions \
-H "Content-Type: application/json" \
-d '{"project_id":"default","label":"Normal operation","event_type":"NORMAL"}' \
| jq -r .id)
# Add observations manually, or let the collector run during normal operation
# Then stop the session — this triggers feature extraction and KB entry creation
curl -X POST http://localhost:8000/learning/sessions/$SESSION/stop

# Trigger/simulate the incident on the monitored system, then:
SESSION=$(curl -s -X POST http://localhost:8000/learning/sessions \
-H "Content-Type: application/json" \
-d '{"project_id":"default","label":"DoS attack — HTTP flood","event_type":"INCIDENT",
"description":"Multiple requests from several sources. Root cause: potential DDoS."}' \
| jq -r .id)
# Wait while the incident runs, then stop:
curl -X POST http://localhost:8000/learning/sessions/$SESSION/stop

Configure config/mmt-rca.yml with your MQTT/Kafka source, then:
make restart-collector

The collector sends every observation to the analysis service, which now matches each one against the knowledge base.
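
The learning-session workflow above (start a session, let observations accumulate, stop it) can also be scripted. This is a minimal sketch using only the standard library. The endpoint paths and JSON fields are taken from the curl examples; the helper names, timeouts, and the wait callback are assumptions for illustration.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def start_session_request(project_id, label, event_type, description=None):
    """Build the POST /learning/sessions request used in the curl examples."""
    body = {"project_id": project_id, "label": label, "event_type": event_type}
    if description is not None:
        body["description"] = description
    return urllib.request.Request(
        BASE_URL + "/learning/sessions",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def stop_session_request(session_id):
    """Build the POST /learning/sessions/{id}/stop request (no body)."""
    return urllib.request.Request(
        f"{BASE_URL}/learning/sessions/{session_id}/stop", method="POST"
    )


def send(request, timeout=30.0):
    """Send a request and decode the JSON response, if there is one."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        raw = resp.read()
    return json.loads(raw) if raw else None


def record_session(label, event_type, wait, project_id="default"):
    """Start a session, call `wait()` while data is collected, then stop it."""
    session = send(start_session_request(project_id, label, event_type))
    try:
        wait()
    finally:
        # Always stop the session so that the KB entry gets built.
        send(stop_session_request(session["id"]))
    return session["id"]
```

With the stack running, a call such as record_session("Normal operation", "NORMAL", lambda: time.sleep(300)) would record five minutes of baseline (after importing time).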
Edit config/mmt-rca.yml:
project: my-client
inputs:
  - name: iot_sensors
    adapter: mqtt
    broker: "mqtt.client.example.com:1883"
    topics:
      - "sensors/+/data"
    feature_map:
      "$.cpu_pct": "cpu"
      "$.free_mem_mb": "mem_free"
      "$.rx_bytes_s": "recv_rate"
      "$.latency_ms": "ms_delay"
    group_by: "$.device_id"
    window_seconds: 30

Then run make restart-collector. No code changes needed.
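
To illustrate what the feature_map and group_by settings above describe, the sketch below turns one decoded message into an observation. It is an assumption-laden approximation of the collector, not its actual code. Only simple dotted paths such as $.a.b are resolved. Using the group_by value as source_id is a guess, and window_seconds aggregation is left out. resolve and to_observation are hypothetical names.

```python
def resolve(path, message):
    """Resolve a simple '$.a.b' path against a decoded JSON message.

    Only dotted key access is handled; full JSONPath supports much more.
    """
    if not path.startswith("$."):
        raise ValueError(f"unsupported path: {path!r}")
    node = message
    for key in path[2:].split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def to_observation(message, feature_map, group_by):
    """Rename raw message fields to attribute names; skip missing fields."""
    attributes = {}
    for path, name in feature_map.items():
        value = resolve(path, message)
        if value is not None:
            attributes[name] = value
    return {"source_id": resolve(group_by, message), "attributes": attributes}


feature_map = {
    "$.cpu_pct": "cpu",
    "$.free_mem_mb": "mem_free",
    "$.rx_bytes_s": "recv_rate",
    "$.latency_ms": "ms_delay",
}
# Hypothetical raw message from one device.
message = {"device_id": "gateway-01", "cpu_pct": 97, "latency_ms": 2850}
observation = to_observation(message, feature_map, "$.device_id")
print(observation)
# {'source_id': 'gateway-01', 'attributes': {'cpu': 97, 'ms_delay': 2850}}
```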
| Method | Path | Description |
|---|---|---|
| GET | /health | Service health + Ollama status |
| POST | /analyze | Analyze one observation → RCA report |
| GET | /events/{project_id} | List detected events (paginated) |
| GET | /events/{project_id}/{id} | Get single event detail |
| POST | /projects | Create a project |
| POST | /learning/sessions | Start a learning session |
| POST | /learning/sessions/{id}/stop | Stop + build KB entry |
| GET | /learning/kb/{project_id} | List knowledge base entries |
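
The endpoints in the table can be called from any HTTP client. As a minimal sketch, the helpers below build and send a /analyze request with the standard library and pull the strongest contributing attributes out of the report. The request and response fields follow the curl example earlier in this README. The helper names and the 60-second timeout are assumptions.

```python
import json
import urllib.request


def analyze_request(base_url, project_id, observation):
    """Build a POST /analyze request for a single observation."""
    payload = {"project_id": project_id, "observation": observation}
    return urllib.request.Request(
        base_url.rstrip("/") + "/analyze",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(request, timeout=60.0):
    """Send a request and decode the JSON body (LLM synthesis can be slow)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


def top_attributes(report, n=3):
    """Attribute names ordered by contribution, largest first."""
    contrib = report.get("contributing_attributes") or {}
    return sorted(contrib, key=contrib.get, reverse=True)[:n]


request = analyze_request(
    "http://localhost:8000",
    "default",
    {
        "timestamp": "2024-01-15T14:31:42Z",
        "source_id": "gateway-01",
        "attributes": {"cpu": 0.97, "ms_delay": 2850},
    },
)
# With the stack running: report = call(request)
# Offline stand-in shaped like the sample response:
report = {"contributing_attributes": {"recv_rate": 0.18, "ms_delay": 0.71}}
print(top_attributes(report))
```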
make up # start all services
make up-dev # start with hot reload
make down # stop all services
make logs # tail all logs
make db-shell # open psql
make ollama-pull # manually pull a model
make restart-analysis
make restart-collector
make clean # remove containers + build cache
| Model | Size | Speed | Quality | Recommended for |
|---|---|---|---|---|
| llama3.1 (8B) | 4.7 GB | Medium | High | Production |
| llama3.2:3b | 2.0 GB | Fast | Medium | Development / low RAM |
| phi3:mini | 2.3 GB | Fast | Medium | Edge deployment |
| mistral:7b | 4.1 GB | Medium | High | Alternative to llama3.1 |
To switch models, set OLLAMA_MODEL (for example OLLAMA_MODEL=llama3.2:3b) in .env, then run make ollama-pull.