Understand how your retrieval changes.
TraceOwl is an observability tool for retrieval systems.
It captures, compares, and explains differences in VectorDB search results so you can quickly understand what changed and where to focus your review.
- Capture real retrieval queries via a proxy
- Compare before/after search results
- Show what changed (documents, ranking, scores)
- Highlight queries you should review first
A report showing what changed between two retrieval runs.
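To make "what changed (documents, ranking, scores)" concrete, here is a minimal sketch of comparing two ranked result lists for the same query. It is an illustration only and does not describe how TraceOwl computes its report; `Hit` and `diff_results` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One search result (hypothetical shape, not TraceOwl's event format)."""
    doc_id: str
    score: float


def diff_results(baseline: list[Hit], candidate: list[Hit]) -> dict:
    """Compare two ranked result lists returned for the same query."""
    base_rank = {h.doc_id: i for i, h in enumerate(baseline)}
    cand_rank = {h.doc_id: i for i, h in enumerate(candidate)}
    base_score = {h.doc_id: h.score for h in baseline}
    cand_score = {h.doc_id: h.score for h in candidate}

    # Documents present in both runs, kept in baseline order.
    shared = [d for d in base_rank if d in cand_rank]
    return {
        "added": [h.doc_id for h in candidate if h.doc_id not in base_rank],
        "removed": [h.doc_id for h in baseline if h.doc_id not in cand_rank],
        # Positive value: the document moved down; negative: it moved up.
        "rank_changes": {
            d: cand_rank[d] - base_rank[d]
            for d in shared
            if cand_rank[d] != base_rank[d]
        },
        "score_deltas": {d: round(cand_score[d] - base_score[d], 6) for d in shared},
    }
```

A query whose result has a non-empty `added` or `removed` list, or large rank changes, is the kind of query a reviewer would likely want to look at first.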
- Your client application configured to use a VectorDB (e.g. Qdrant, Pinecone)
- A running VectorDB (Qdrant or Pinecone-compatible)
- Docker (optional, for running TraceOwl components)
- Pull the TraceOwl Proxy/Analyzer image
# Proxy (open-source)
docker pull ghcr.io/yito88/traceowl-proxy:v1.0.0
# Analyzer (license required)
docker pull ghcr.io/yito88/traceowl-analyzer:v1.0.0
- Download the TraceOwl Diff binary from the release page (optional, for diffing event files)
- The analyzer requires a license to run. Trial access is available - contact contact@traceowl.org
The proxy sits in front of your vector DB and captures search events.
docker run -d --name traceowl-proxy \
  -v ./proxy.toml:/config.toml:ro \
  -v ./data:/data \
  traceowl-proxy /config.toml
proxy.toml (local mode):
backend = "qdrant"
listen_addr = "0.0.0.0:6333"
upstream_base_url = "http://localhost:6334" # your vector DB
sampling_rate = 1.0
[sink]
mode = "local_only"
local_output_root = "/data"
The proxy exposes a control API at http://localhost:6333/control/.
Wait until it is ready:
curl http://localhost:6333/control/status
Start a tracing session:
curl -s -X POST http://localhost:6333/control/tracing/start \
  -H 'Content-Type: application/json' \
  -d '{"sampling_rate": 1.0}'
Response:
{
  "status": "started",
  "session_id": "20260419T120000Z_a1b2c3"
}
Save the session_id; you will need it for the analyzer.
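If you drive the proxy from a script rather than from curl, the start call can be sketched as follows. This is a minimal example using only the Python standard library; it assumes the endpoint, request body, and response fields shown above. The helper names are hypothetical, and treating any status other than "started" as an error is an assumption.

```python
import json
import urllib.request

CONTROL_BASE = "http://localhost:6333/control"  # control API address used in this guide


def build_start_request(sampling_rate: float = 1.0,
                        base: str = CONTROL_BASE) -> urllib.request.Request:
    """Build the POST request that mirrors the curl start command."""
    body = json.dumps({"sampling_rate": sampling_rate}).encode("utf-8")
    return urllib.request.Request(
        f"{base}/tracing/start",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_session_id(response_body: str) -> str:
    """Extract session_id from a start response.

    Assumption: any status other than "started" means tracing did not begin.
    """
    payload = json.loads(response_body)
    if payload.get("status") != "started":
        raise RuntimeError(f"unexpected status: {payload.get('status')!r}")
    return payload["session_id"]


def start_tracing(sampling_rate: float = 1.0) -> str:
    """Send the request to a running proxy and return the new session ID."""
    request = build_start_request(sampling_rate)
    with urllib.request.urlopen(request, timeout=10) as resp:
        return parse_session_id(resp.read().decode("utf-8"))
```

Calling `start_tracing()` against a running proxy returns the session ID as a string, which a script can store for the later comparison step.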
Stop tracing:
curl -s -X POST http://localhost:6333/control/tracing/stop \
  -H 'Content-Type: application/json'
Response:
{
  "status": "stopped",
  "stopped_session_id": "20260419T120000Z_a1b2c3",
  "local_output_prefix": "events/20260419T120000Z_a1b2c3/",
  "remote_output_prefix": "events/20260419T120000Z_a1b2c3/",
  "upload_status": "not_configured"
}
In S3 mode, wait for upload_status to leave "pending" before proceeding; the proxy uploads files asynchronously after the session closes.
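The wait described above can be sketched as a small polling helper. This guide does not say which endpoint reports upload_status after the session stops, so the helper takes a caller-supplied function, `fetch_upload_status`, which is a hypothetical callable that returns the current value as a string. The timeout and poll interval defaults are also assumptions.

```python
import time
from typing import Callable


def wait_for_upload(
    fetch_upload_status: Callable[[], str],
    timeout_s: float = 300.0,
    poll_interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until upload_status is no longer "pending" and return the final value.

    fetch_upload_status is supplied by the caller (hypothetical); it should
    return the current upload_status string reported by the proxy.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = fetch_upload_status()
        if status != "pending":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError("upload_status is still 'pending'")
        sleep(poll_interval_s)
```

In local mode the stop response already reports "not_configured", so the helper returns on the first check without sleeping.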
If you have the analyzer, you can skip this step and point it directly at the raw event files.
traceowl-diff \
  --baseline data/<baseline-session-id>/*.jsonl \
  --candidate data/<candidate-session-id>/*.jsonl \
  --output diff.json
Run the analyzer to generate the report:
docker run --rm \
  -v ./data:/data \
  -v ./license.json:/license.json \
  traceowl-analyzer \
  --license /license.json \
  analyze \
  --baseline-dir /data/<baseline-session-id> \
  --candidate-dir /data/<candidate-session-id> \
  --output-html /data/report.html \
  --summary-json /data/summary.json
- Architecture: how the components fit together
- Installation: build from source or Docker
- Operation: configuration reference and full API walkthrough
Questions, feedback, or trial requests: contact@traceowl.org


