pip install cognis-embedaudit
embedaudit scan . # → prioritized findings in seconds
-
Install the CLI:
pip install cognis-embedaudit
-
Audit a snapshot of your vector store — a JSONL file of embedding records — for near-duplicates and single-vector domination:
embedaudit audit snapshot.jsonl --dup-threshold 0.999 --domination-share 0.30
-
Compare against a trusted baseline to catch drift / poisoning between two snapshots:
embedaudit drift baseline.jsonl current.jsonl --drift-threshold 0.15
-
Read the output. Add --format json for a machine-readable report and a non-zero exit code when findings exceed your thresholds:
embedaudit audit snapshot.jsonl --format json > report.json
-
Wire it into CI — fail the build when an index regresses:
embedaudit drift baseline.jsonl current.jsonl --format json || exit 1
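The audit step above checks for near-duplicates and single-vector domination. The sketch below is one way to picture those two checks, not embedaudit's actual implementation. It assumes each JSONL record has `id` and `embedding` fields, treats a pair as a near-duplicate when cosine similarity reaches the threshold, and defines "domination" as one vector being the nearest neighbour of more than the given share of records. All three are assumptions made for illustration.

```python
import json
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def load_jsonl(lines):
    """Parse JSONL lines into (id, embedding) pairs; the field names are assumed."""
    records = []
    for line in lines:
        line = line.strip()
        if line:
            obj = json.loads(line)
            records.append((obj["id"], obj["embedding"]))
    return records


def audit(records, dup_threshold=0.999, domination_share=0.30):
    """Return near-duplicate pairs and ids that are the nearest neighbour too often."""
    duplicates = []
    nearest = Counter()
    for i, (id_a, vec_a) in enumerate(records):
        best_id, best_sim = None, -2.0
        for j, (id_b, vec_b) in enumerate(records):
            if i == j:
                continue
            sim = cosine(vec_a, vec_b)
            if j > i and sim >= dup_threshold:
                duplicates.append((id_a, id_b))  # report each pair once
            if sim > best_sim:
                best_id, best_sim = id_b, sim
        if best_id is not None:
            nearest[best_id] += 1
    total = len(records)
    dominators = sorted(
        rid for rid, hits in nearest.items() if hits / total > domination_share
    )
    return {"duplicates": duplicates, "dominators": dominators}


# Tiny in-memory snapshot: "hub" sits close to everything, "z2" repeats "z".
snapshot = [
    '{"id": "hub", "embedding": [1.0, 1.0, 1.0]}',
    '{"id": "x", "embedding": [1.0, 0.0, 0.0]}',
    '{"id": "y", "embedding": [0.0, 1.0, 0.0]}',
    '{"id": "z", "embedding": [0.0, 0.0, 1.0]}',
    '{"id": "z2", "embedding": [0.0, 0.0, 1.0]}',
]
report = audit(load_jsonl(snapshot))
print(report)
```

The pairwise loop is quadratic, which is fine for a small snapshot but would need an approximate nearest-neighbour index on a large store.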
RAG ops niche
embedaudit is single-purpose, scriptable, and self-hostable: point it at a target, get prioritized results in the format your workflow already speaks (table · JSON · SARIF), gate CI on it, and let agents drive it over MCP.
-
✅ Load JSONL
-
✅ Audit Store
-
✅ Drift Report
-
✅ Runs on Linux/macOS/Windows · Docker · devcontainer
-
✅ Ports in Python, JavaScript, Go, and Rust (ports/)
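To make the Drift Report feature listed above more concrete, here is a minimal sketch of a baseline-versus-current comparison. It is an illustration under assumptions, not the tool's own algorithm: snapshots are modelled as `{id: embedding}` dicts, drift is measured as per-record cosine distance, and the `drift_report` function name and result keys are hypothetical.

```python
import math


def cosine_distance(a, b):
    """1 - cosine similarity; treated as 1.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if not na or not nb:
        return 1.0
    return 1.0 - dot / (na * nb)


def drift_report(baseline, current, drift_threshold=0.15):
    """Compare two {id: embedding} snapshots and list what changed."""
    shared = baseline.keys() & current.keys()
    drifted = sorted(
        rid for rid in shared
        if cosine_distance(baseline[rid], current[rid]) > drift_threshold
    )
    return {
        "added": sorted(current.keys() - baseline.keys()),
        "removed": sorted(baseline.keys() - current.keys()),
        "drifted": drifted,
    }


baseline = {"doc-1": [1.0, 0.0], "doc-2": [0.0, 1.0], "doc-3": [1.0, 1.0]}
current = {"doc-1": [1.0, 0.0], "doc-2": [1.0, 0.0], "doc-4": [0.5, 0.5]}
drift = drift_report(baseline, current)
print(drift)
```

Reporting added and removed ids separately from drifted ones keeps a silently rewritten vector (a possible poisoning signal) distinct from ordinary index growth.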
pip install cognis-embedaudit
embedaudit --version
embedaudit scan . # scan current project
embedaudit scan . --format json # machine-readable
embedaudit scan . --fail-on high # CI gate (non-zero exit)
$ embedaudit scan .
[HIGH ] EMB-001 example finding (./src/app.py)
[MEDIUM ] EMB-002 another signal (./config.yaml)
2 findings · risk score 5 · 38ms
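If only the human-readable output is available, the finding lines shown above can be turned into structured records with a small parser. This is a sketch based solely on the sample output. The severity ordering (including LOW and CRITICAL, which do not appear in the sample) and the `parse_findings` / `should_fail` helpers are assumptions, and `--format json` is likely the more robust route when it is available.

```python
import re

# Matches lines such as "[HIGH ] EMB-001 example finding (./src/app.py)".
FINDING_RE = re.compile(r"^\[(\w+)\s*\]\s+(\S+)\s+(.*?)\s+\((.+)\)$")

# Assumed severity ordering; only HIGH and MEDIUM appear in the sample output.
SEVERITY_RANK = {"LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}


def parse_findings(text):
    """Extract severity, rule id, message, and path from each finding line."""
    findings = []
    for line in text.splitlines():
        match = FINDING_RE.match(line.strip())
        if match:
            severity, rule_id, message, path = match.groups()
            findings.append(
                {"severity": severity, "rule": rule_id, "message": message, "path": path}
            )
    return findings


def should_fail(findings, fail_on="high"):
    """Mimic a CI gate: True when any finding is at or above the given severity."""
    limit = SEVERITY_RANK[fail_on.upper()]
    return any(SEVERITY_RANK.get(f["severity"], 0) >= limit for f in findings)


sample = """\
[HIGH ] EMB-001 example finding (./src/app.py)
[MEDIUM ] EMB-002 another signal (./config.yaml)
2 findings · risk score 5 · 38ms
"""
findings = parse_findings(sample)
print(findings)
print(should_fail(findings, "high"))
```

The summary line is skipped because it does not start with a bracketed severity.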
flowchart LR
IN[target / manifest] --> P[embedaudit<br/>checks + rules]
P --> OUT["findings (JSON / SARIF)"]
embedaudit interoperates with the common ways of using AI:
-
MCP server: embedaudit mcp (Claude Desktop, Cursor, Cognis.Studio, uncensored-fleet)
-
OpenAI-compatible / JSON: pipe embedaudit scan . --format json into any agent or LLM
-
LangChain · CrewAI · AutoGen · LlamaIndex: wrap the CLI/JSON as a tool in one line
-
CI / scripts: exit codes + SARIF for non-AI pipelines
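Wrapping the CLI as an agent tool, as the list above suggests, can be as small as one function that shells out and parses the JSON. The sketch below uses only the `embedaudit scan . --format json` invocation shown in this README. The `run_embedaudit_scan` name, the injectable `runner` parameter, and the `{"findings": []}` payload used by the stand-in runner are hypothetical, since the real report schema is not documented here.

```python
import json
import subprocess


def run_embedaudit_scan(target=".", runner=subprocess.run):
    """Run `embedaudit scan <target> --format json` and return the parsed report."""
    completed = runner(
        ["embedaudit", "scan", target, "--format", "json"],
        capture_output=True,
        text=True,
        check=False,  # a non-zero exit may simply mean findings were reported
    )
    if not completed.stdout.strip():
        raise RuntimeError(f"embedaudit produced no output: {completed.stderr}")
    return json.loads(completed.stdout)


def _fake_runner(cmd, **kwargs):
    """Stand-in for subprocess.run so the sketch runs without the CLI installed."""
    return subprocess.CompletedProcess(cmd, 0, stdout='{"findings": []}', stderr="")


demo = run_embedaudit_scan(".", runner=_fake_runner)
print(demo)
```

A function with this shape can then be registered with whichever agent framework is in use. Dropping the `runner` argument makes it call the real CLI.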
| | Cognis embedaudit | RAG security |
|---|:---:|:---:|
| Self-hostable, no account | ✅ | varies |
| Single command, zero config | ✅ | |
| JSON + SARIF for CI | ✅ | varies |
| MCP-native (AI agents) | ✅ | ❌ |
| Polyglot ports (JS/Go/Rust) | ✅ | ❌ |
| Open license | ✅ COCL | varies |
Built in the spirit of RAG security, re-framed the Cognis way. Missing a credit? Open a PR.
Pipes into your stack: SARIF for code-scanning, JSON for anything, an MCP server (embedaudit mcp) for AI agents, and a webhook forwarder for SIEM/Slack/Jira. See docs/INTEGRATIONS.md.
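For readers unfamiliar with SARIF, the following sketch shows roughly what a minimal SARIF 2.1.0 document for a single finding looks like. It is not the tool's own serializer: the finding dict shape, the `to_sarif` helper, and the severity-to-level mapping are assumptions, and embedaudit's real SARIF output may carry more fields.

```python
import json

# Assumed mapping from embedaudit severities to SARIF result levels.
LEVELS = {"HIGH": "error", "MEDIUM": "warning", "LOW": "note"}


def to_sarif(findings, tool_name="embedaudit"):
    """Build a minimal SARIF 2.1.0 log with one run and one result per finding."""
    results = []
    for finding in findings:
        results.append({
            "ruleId": finding["rule"],
            "level": LEVELS.get(finding["severity"], "none"),
            "message": {"text": finding["message"]},
            "locations": [{
                "physicalLocation": {"artifactLocation": {"uri": finding["path"]}}
            }],
        })
    return {
        "version": "2.1.0",
        "runs": [{"tool": {"driver": {"name": tool_name}}, "results": results}],
    }


# Hypothetical finding, shaped after the sample output in this README.
example_findings = [
    {"severity": "HIGH", "rule": "EMB-001", "message": "example finding", "path": "src/app.py"},
]
sarif = to_sarif(example_findings)
print(json.dumps(sarif, indent=2))
```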
pip install "git+https://github.com/cognis-digital/embedaudit.git" # pip (works today)
pipx install "git+https://github.com/cognis-digital/embedaudit.git" # isolated CLI
uv tool install "git+https://github.com/cognis-digital/embedaudit.git" # uv
pip install cognis-embedaudit # PyPI (when published)
docker run --rm ghcr.io/cognis-digital/embedaudit:latest --help # Docker
brew install cognis-digital/tap/embedaudit # Homebrew tap
curl -fsSL https://raw.githubusercontent.com/cognis-digital/embedaudit/main/install.sh | sh
| Linux | macOS | Windows | Docker | Cloud |
|---|---|---|---|---|
| scripts/setup-linux.sh | scripts/setup-macos.sh | scripts/setup-windows.ps1 | docker run ghcr.io/cognis-digital/embedaudit | DEPLOY.md (AWS/Azure/GCP/k8s) |
-
duckprobe: Zero-setup data-quality checks on any file or warehouse via DuckDB
-
schemadrift: Schema-change detector and data-contract tests
-
csvlens: Fast CLI for profiling and cleaning huge CSV / Parquet files
-
piiscan: PII discovery across warehouses and lakes (data-side scanner)
-
lineagemap: Column-level lineage extracted from SQL and dbt
-
datasetcard: Auto Dataset Cards / datasheets with Croissant + provenance
Explore the suite → 🗂️ all 170+ tools · ⭐ awesome-cognis · 🔗 cognis-sources · 🤖 uncensored-fleet · 🧠 engram
PRs, new rules, and demo scenarios are welcome under the collaboration-pull model — see CONTRIBUTING.md and SECURITY.md.
embedaudit composes with the 300+ tool Cognis suite through JSON in/out and a shared
OpenAI-compatible /v1 backbone. See INTEROP.md for the
suite map, composition patterns, and reference stacks.
Source-available under the Cognis Open Collaboration License (COCL) v1.0 — free for personal, internal-evaluation, research, and educational use; commercial / production use requires a license (licensing@cognis.digital). See LICENSE.