MCPSafe — free security scanner for MCP servers (looking for FPR feedback) #746
Replies: 1 comment
If you are looking for FPR feedback, I would definitely include a bucket of benign security/admin text that looks scary but should stay allowed. In practice that is where a lot of scanners get annoying fast.

Examples that tend to separate useful tools from noisy ones:
- a security runbook that literally contains strings like "ignore previous instructions" as examples
- a red-team writeup describing exfiltration techniques without instructing the current agent to perform them
- JSON / YAML policy files that mention dangerous commands in comments or test fixtures
- CI logs or stack traces that contain secrets-shaped strings but are already redacted / fake

The other useful split is surface:
- user prompt
- retrieved document
- tool result
- tool arguments

A detector can be excellent on one surface and painful on another, so FPR/FNR numbers are much more actionable when you keep those separate.
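The per-surface split above is easy to operationalize. A minimal sketch of an evaluation harness (sample data and surface names are illustrative, not from MCPSafe):

```python
from collections import defaultdict

# Hypothetical labeled samples: (surface, is_benign, was_flagged).
# Surfaces mirror the split suggested above.
SAMPLES = [
    ("user_prompt",        True,  False),
    ("user_prompt",        True,  True),   # benign runbook text flagged -> FP
    ("retrieved_document", True,  False),
    ("retrieved_document", False, True),   # true positive, excluded from FPR
    ("tool_result",        True,  True),   # redacted secret flagged -> FP
    ("tool_result",        True,  False),
    ("tool_arguments",     True,  False),
]

def per_surface_fpr(samples):
    """FPR = flagged benign / all benign, computed separately per surface."""
    benign = defaultdict(int)
    false_pos = defaultdict(int)
    for surface, is_benign, was_flagged in samples:
        if is_benign:
            benign[surface] += 1
            if was_flagged:
                false_pos[surface] += 1
    return {s: false_pos[s] / benign[s] for s in benign}

print(per_surface_fpr(SAMPLES))
# e.g. user_prompt -> 0.5, retrieved_document -> 0.0
```

Keeping the denominators separate is the whole point: an aggregate FPR would hide that the tool is fine on user prompts but noisy on tool results.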
What would you like to share?
Built a security scanner specifically for MCP servers and would love feedback from the people who actually maintain them.
🔗 https://mcpsafe.io (free, no signup)
I kept connecting community MCP servers to Claude Desktop and realized I had no way to check what I was trusting with shell access and OAuth tokens. Generic SAST tools don't know what an MCP server looks like, so they miss the patterns that matter — tool poisoning, prompt injection into inner LLMs, OAuth confused-deputy in MCP auth-server proxies, the recent tool-handler tainted-argv CVE class (CVE-2025-6514, CVE-2026-5058, CVE-2026-30623), DNS rebinding on local HTTP transports (CVE-2025-66414), missing PRM endpoint per the June 2025 authz spec, etc.
Coverage: 43 rules — 11 mapped 1:1 to the official MCP security best practices spec, 9 derived from published CVEs, 4 from Unit42's MCP attack-vector research, the rest cover OWASP LLM Top 10 + classic SAST classes (secrets, SSRF, container security, wildcard IAM scopes, typosquat).
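For readers unfamiliar with Semgrep rules-as-data, a rule in this style is just a YAML document the scanner can hot-reload. A hypothetical example of the kind of pattern described above (the rule id and message are mine, not MCPSafe's actual rules):

```yaml
rules:
  - id: mcp-tool-handler-shell-true
    languages: [python]
    severity: ERROR
    message: >
      subprocess invoked with shell=True inside what looks like a tool
      handler; tainted tool arguments may reach the shell.
    patterns:
      - pattern: subprocess.$FUNC(..., shell=True, ...)
```

Because rules are plain data rather than code, shipping a new detection for a fresh CVE class is a config push (here, via SNS) instead of a redeploy.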
Architecture: Semgrep OSS with rules-as-data (hot-reload via SNS, no deploy needed for new rules). Deep scans add a 5-judge LLM consensus panel — Bedrock Sonnet, Bedrock Haiku, OpenAI, Mistral, Vertex Flash. A finding only counts if the cross-judge median agrees, which drops the FPR vs any single-model scoring.
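The cross-judge median gate is simple to state precisely. A minimal sketch (function name and threshold are illustrative, not MCPSafe's actual scoring code):

```python
import statistics

def consensus_flag(judge_scores, threshold=0.5):
    """A finding counts only if the cross-judge median clears the threshold.

    judge_scores: per-judge confidence in [0, 1] that the finding is real.
    The median is robust to one or two outlier judges, which is what
    drops the FPR relative to trusting any single model's score.
    """
    return statistics.median(judge_scores) >= threshold

# Hypothetical scores from a 5-judge panel: two judges disagree,
# but the median still reflects the majority view.
print(consensus_flag([0.9, 0.8, 0.7, 0.2, 0.1]))  # median 0.7 -> True
print(consensus_flag([0.9, 0.6, 0.4, 0.2, 0.1]))  # median 0.4 -> False
```

With five judges the median is the third-highest score, so a single over-eager model can never flag a finding on its own.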
Inputs: GitHub URL, npm package, PyPI package, Docker Hub or GHCR image. Pin a specific commit/version with `@version`. Each scan takes ~45 seconds.
What it doesn't catch yet (being honest):
The most useful feedback I can get is "scan flagged X but it's actually safe because Y" — every false-positive report goes straight back into the rules. Same for "you should be covering [attack class Z]" — if you've seen something I'm not catching, please share.
Free public scanning stays free, no signup. Paid plans for private repos + API. I'm not Anthropic-affiliated — just a user who thought this was missing from the ecosystem.
Relevant Links
https://mcpsafe.io/