Security: arthurpanhku/DocSentinel

Security Policy | 安全策略

This document covers vulnerability disclosure and security-related practices for the DocSentinel project — an AI-powered SSDLC platform. It aligns with PRD §7.2 Security Requirements and Controls.

本文档涵盖 DocSentinel 项目(AI 驱动的 SSDLC 平台)的漏洞披露与安全实践,遵循 PRD §7.2 安全需求与控制。


Supported Versions | 支持版本

Version Supported
4.0.x
3.1.x
3.0.x
2.0.x ⚠️ Limited
< 2.0

Reporting a Vulnerability | 漏洞报告

If you discover a security vulnerability, please report it responsibly:

  1. Do not open a public GitHub issue for security-sensitive findings.
  2. Email the maintainers (e.g. the contact in the PRD: u3638376@connect.hku.hk) with:
    • A description of the vulnerability and steps to reproduce.
    • Impact and suggested fix if possible.
  3. We will acknowledge receipt and aim to respond within a reasonable timeframe. We may ask for more details and will keep you updated on remediation and disclosure.

如果您发现了安全漏洞,请负责任地进行报告:

  1. 请勿针对敏感安全问题提交公开的 GitHub Issue。
  2. 发送邮件给维护者(联系方式见 PRD:u3638376@connect.hku.hk),包含:
    • 漏洞描述与复现步骤。
    • 影响范围与建议修复方案(如有)。
  3. 我们将在合理时间内确认收到并回复。可能会向您询问更多细节,并同步后续的修复与披露进度。

Security-Related Configuration | 安全相关配置

  • Secrets: Do not commit .env or any file containing SECRET_KEY, API keys, or passwords. Use .env.example as a template only.

  • Input Validation: File type and size limits are enforced (see UPLOAD_MAX_FILE_SIZE_MB, UPLOAD_MAX_FILES). Only allowed extensions are parsed (see app/parser/service.py).

  • MCP Document Roots: assess_document.file_path is confined to MCP_DOCUMENT_ROOTS before any file read. Configure this to the smallest approved document directory; never expose the MCP server with broad roots such as /, a user home directory, or a shared workspace containing secrets.

  • KB Reindex Roots: /api/v1/kb/reindex is confined to KB_REINDEX_ROOTS, including resolved symlink targets. Do not reuse a broad application working directory as the reindex root.

  • Prompt Injection Guardrails: Input sanitization via regex pattern detection and length limits is enforced before content reaches the LLM (see app/core/guardrails.py). Malicious inputs are rejected with HTTP 400.

  • TLS: In production, use HTTPS and TLS 1.2+ for all endpoints and external calls (PRD §7.2 DATA-01).

  • Auth: The MVP API does not currently enforce authentication; add AAD or API-key authentication per PRD §7.2 IAM before exposing it externally.

  • LangGraph State: Assessment state and checkpoints may contain sensitive document content. Ensure LANGGRAPH_CHECKPOINT_DIR is on encrypted storage in production.

  • SAST/DAST Integration: When ingesting scan results from external tools, validate report integrity and source authenticity.

  • 机密信息:请勿提交 .env 或任何包含 SECRET_KEY、API Key、密码的文件。.env.example 仅作为模板使用。

  • 输入验证:强制执行文件类型与大小限制(见 UPLOAD_MAX_FILE_SIZE_MB、UPLOAD_MAX_FILES)。仅解析允许的扩展名(见 app/parser/service.py)。

  • MCP 文档根目录:assess_document.file_path 必须在任何文件读取之前被限制在 MCP_DOCUMENT_ROOTS 内。请将该值配置为最小必要的批准文档目录;不要在对外暴露 MCP server 时使用 /、用户 home 目录或包含密钥的共享工作区等宽泛根目录。

  • 知识库重建根目录:/api/v1/kb/reindex 必须限制在 KB_REINDEX_ROOTS 内,并检查解析后的 symlink 目标。不要直接把宽泛的应用工作目录作为重建根目录。

  • 提示注入防护:通过正则模式检测和长度限制对输入进行清洗,在内容到达 LLM 之前执行(见 app/core/guardrails.py)。恶意输入将被 HTTP 400 拒绝。

  • TLS:生产环境中,所有端点与外部调用必须使用 HTTPS 和 TLS 1.2+(PRD §7.2 DATA-01)。

  • 认证:MVP 阶段 API 暂未强制认证;在对外暴露前,请根据 PRD §7.2 IAM 添加 AAD/API Key 认证。

  • LangGraph 状态:评估状态和检查点可能包含敏感文档内容。生产环境中请确保 LANGGRAPH_CHECKPOINT_DIR 位于加密存储上。

  • SAST/DAST 集成:从外部工具接入扫描结果时,请验证报告完整性和来源真实性。
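
As an illustration of the prompt-injection guardrail described above (regex pattern detection plus a length limit, with rejection before content reaches the LLM), the following is a minimal sketch. It is not the contents of app/core/guardrails.py: the names MAX_INPUT_CHARS, INJECTION_PATTERNS, GuardrailViolation, and check_input, as well as the limit and the patterns themselves, are assumptions made for this example.

```python
import re

# Hypothetical limit and patterns, chosen only for illustration.
MAX_INPUT_CHARS = 20_000
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?(previous|prior)\s+instructions", re.IGNORECASE),
    re.compile(r"reveal\s+(the\s+)?system\s+prompt", re.IGNORECASE),
]


class GuardrailViolation(ValueError):
    """Raised when input fails a guardrail; an API layer could map this to HTTP 400."""


def check_input(text: str) -> str:
    """Return the text unchanged if it passes the length and pattern checks."""
    if len(text) > MAX_INPUT_CHARS:
        raise GuardrailViolation("input too long")
    for pattern in INJECTION_PATTERNS:
        if pattern.search(text):
            # The message names no pattern, so it gives an attacker little feedback.
            raise GuardrailViolation("input matched a blocked pattern")
    return text
```

Pattern lists like this catch only known phrasings, so a check of this kind is best treated as one layer alongside the authorization and confinement controls listed above.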


Secure Development Guidelines | 安全开发准则

Use the following principles when adding new API, MCP, parser, KB, or agent features.

新增 API、MCP、Parser、KB 或 Agent 功能时,请遵循以下原则。

Treat External Inputs as Authority Requests | 将外部输入视为权限请求

Any value controlled by an API client, MCP caller, LLM tool call, browser UI, uploaded file, or environment-adjacent integration is untrusted. Before using it to access local resources, ask:

  • Who supplied this value?
  • Whose authority will execute the action?
  • What boundary proves this caller is allowed to do it?

API client、MCP caller、LLM tool call、浏览器 UI、上传文件或外部集成提供的值都不可信。使用它访问本地资源前,应先问:

  • 这个值是谁提供的?
  • 实际执行动作的是谁的权限?
  • 有什么边界能证明调用者被允许这样做?

File and Path Handling | 文件与路径处理

File paths are not ordinary strings. A caller-controlled path asks the server process to use server-side filesystem permissions.

  • Resolve paths with Path.resolve() or equivalent realpath semantics before access.
  • Check that the resolved path is inside an explicit allow-root such as MCP_DOCUMENT_ROOTS or KB_REINDEX_ROOTS.
  • Validate symlink targets after resolution; symlinks must not escape the allow-root.
  • Reject directories, devices, sockets, and other non-regular files.
  • Validate file extension and size before reading content.
  • Never use extension allow-lists as a substitute for directory confinement.
  • Add tests for absolute paths, .., symlink escape, unsupported extensions, and missing files.

文件路径不是普通字符串。调用者可控路径意味着调用者请求 server 进程使用 server 端文件系统权限。

  • 访问前使用 Path.resolve() 或等价 realpath 语义解析路径。
  • 检查解析后的路径是否位于显式允许根目录内,例如 MCP_DOCUMENT_ROOTS 或 KB_REINDEX_ROOTS。
  • 解析后检查 symlink 目标;symlink 不得逃逸允许根目录。
  • 拒绝目录、设备文件、socket 和其他非普通文件。
  • 在读取内容前验证扩展名和大小。
  • 不要把扩展名白名单当作目录访问控制。
  • 为绝对路径、..、symlink 逃逸、不支持扩展名和不存在文件添加测试。
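
The path-handling rules above can be sketched as a single confinement helper. This is a hedged example rather than the project's implementation: confine_path, PathRejected, ALLOWED_EXTENSIONS, and MAX_FILE_BYTES are hypothetical names and values, and allow_roots stands in for whatever MCP_DOCUMENT_ROOTS or KB_REINDEX_ROOTS is configured to. It assumes Python 3.9+ for Path.is_relative_to.

```python
import stat
from pathlib import Path

# Hypothetical values for illustration only.
ALLOWED_EXTENSIONS = {".pdf", ".docx", ".md", ".txt"}
MAX_FILE_BYTES = 10 * 1024 * 1024


class PathRejected(Exception):
    """Carries a short reason only, never the offending path."""


def confine_path(user_path: str, allow_roots: list[Path]) -> Path:
    """Resolve user_path and return it only if it passes every check."""
    try:
        # resolve() follows symlinks and collapses "..", so later checks
        # apply to the real target rather than the string the caller sent.
        resolved = Path(user_path).resolve(strict=True)
    except (OSError, RuntimeError, ValueError):
        raise PathRejected("file not found") from None
    roots = [root.resolve() for root in allow_roots]
    if not any(resolved.is_relative_to(root) for root in roots):
        raise PathRejected("path outside allowed roots")
    info = resolved.stat()
    if not stat.S_ISREG(info.st_mode):
        raise PathRejected("not a regular file")
    if resolved.suffix.lower() not in ALLOWED_EXTENSIONS:
        raise PathRejected("unsupported file type")
    if info.st_size > MAX_FILE_BYTES:
        raise PathRejected("file too large")
    return resolved
```

Directory confinement runs before the extension check, so an allowed extension never substitutes for an allowed location. A file can still change between this check and the later open(), so the caller should open the returned resolved path rather than the original string.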

MCP and Agent Tools | MCP 与 Agent 工具

MCP tools and A2A messages are security boundaries because an agent may call them based on user input or prompt-injected instructions.

  • Keep tool scopes narrow and explicit.
  • Prefer IDs, handles, or uploaded document references over arbitrary local paths.
  • If a tool must touch local files, require an allow-root and document the expected configuration.
  • Keep tokenless remote protocols confined to loopback. Require AGENT_GATEWAY_TOKEN, TLS, and an upstream identity layer before network exposure.
  • Keep MCP DNS-rebinding protection enabled. Set AGENT_GATEWAY_ALLOWED_HOSTS and AGENT_GATEWAY_ALLOWED_ORIGINS to the smallest production allow-lists that work.
  • Do not let an agent disable collaborative review or approve its own assessment.
  • Treat MCP tools as bounded capabilities and A2A as task delegation; both must call the same application service and policy checks.
  • Return minimal error details; do not echo sensitive paths or content.
  • Assume tool output may be visible to the caller and may be copied into an LLM transcript.

MCP 工具和 A2A 消息都是安全边界,因为 agent 可能基于用户输入或 prompt injection 指令调用它们。

  • 保持工具作用域小而明确。
  • 优先使用 ID、handle 或上传文档引用,而不是任意本地路径。
  • 如果工具必须访问本地文件,必须要求允许根目录并记录配置方式。
  • 未配置 token 时,远程协议只能绑定本机回环地址。对网络开放前必须配置 AGENT_GATEWAY_TOKEN、TLS 和上游身份认证。
  • 保持 MCP DNS rebinding 防护开启,并将 AGENT_GATEWAY_ALLOWED_HOSTS 和 AGENT_GATEWAY_ALLOWED_ORIGINS 收敛到最小可信范围。
  • 不允许 agent 关闭协作评审,也不允许 agent 审批自己的评估结果。
  • MCP 用于受限工具能力,A2A 用于任务委派;两者必须复用相同的应用服务和策略检查。
  • 返回最小必要错误信息;不要回显敏感路径或内容。
  • 假设工具输出会被调用者看到,也可能进入 LLM transcript。
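
The rule that tokenless remote protocols stay on loopback could be enforced with a startup check along the following lines. This is a sketch under stated assumptions: is_loopback, check_bind, and token_matches are hypothetical helpers, and how the host and the AGENT_GATEWAY_TOKEN value reach them is left to the application.

```python
import hmac
import ipaddress
from typing import Optional


def is_loopback(host: str) -> bool:
    """Return True only for addresses that cannot be reached from the network."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Unrecognized hostnames are treated as network-exposed.
        return False


def check_bind(host: str, token: Optional[str]) -> None:
    """Refuse to start a tokenless gateway on a non-loopback interface."""
    if not token and not is_loopback(host):
        raise RuntimeError("AGENT_GATEWAY_TOKEN is required for non-loopback binds")


def token_matches(presented: str, expected: str) -> bool:
    # compare_digest avoids leaking how many leading characters matched via timing.
    return hmac.compare_digest(presented.encode(), expected.encode())
```

Failing at startup, rather than logging a warning, means a misconfigured deployment cannot quietly expose tools without a token. TLS and the upstream identity layer remain separate requirements that this check does not cover.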

LLM Data Flow | LLM 数据流

Anything sent to an LLM provider can leave the local process. Before passing data to the LLM:

  • Confirm the data was intentionally selected by an authorized workflow.
  • Avoid sending secrets, credentials, raw .env content, private keys, or unrelated local files.
  • Preserve citations and metadata so generated findings can be audited.
  • Make local-model and private-deployment modes clear for sensitive use cases.

任何发送给 LLM provider 的内容都可能离开本地进程。传给 LLM 前应确认:

  • 数据是由授权工作流有意选择的。
  • 避免发送密钥、凭据、原始 .env 内容、私钥或无关本地文件。
  • 保留引用和元数据,方便审计生成结果。
  • 对敏感场景清楚说明本地模型和私有部署模式。
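
One way to back up the "avoid sending secrets" rule is a last-line screen that runs just before text is handed to an LLM client. The sketch below is illustrative only: SECRET_PATTERNS, contains_secret_like, and prepare_for_llm are hypothetical names, and the two patterns cover only PEM private-key headers and .env-style assignments.

```python
import re

# Hypothetical patterns; a real deployment would tune and extend these.
SECRET_PATTERNS = [
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    re.compile(
        r"^\s*(SECRET_KEY|[A-Z0-9_]*API_KEY|[A-Z0-9_]*PASSWORD)\s*=\s*\S+",
        re.IGNORECASE | re.MULTILINE,
    ),
]


def contains_secret_like(text: str) -> bool:
    """Return True if any secret-like pattern appears in the text."""
    return any(pattern.search(text) for pattern in SECRET_PATTERNS)


def prepare_for_llm(text: str) -> str:
    """Return the text unchanged, or raise if it looks like it carries a secret."""
    if contains_secret_like(text):
        # The error deliberately omits the matched content.
        raise ValueError("refusing to send secret-like content to the LLM")
    return text
```

Pattern matching finds only obvious cases, so it complements the first rule above, that data is intentionally selected by an authorized workflow, and does not replace it.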

Required Review Checklist | 必要 Review 检查清单

Before merging security-relevant changes, reviewers should verify:

  • New external inputs have validation, authorization, and boundary checks.
  • File access is confined before open(), parser invocation, indexing, or LLM processing.
  • Tests cover the denied path as well as the successful path.
  • Documentation and .env.example describe any new security-sensitive setting.
  • The change has been checked with ruff, pytest, and relevant frontend build/tests.

合并安全相关改动前,reviewer 应确认:

  • 新增外部输入具备验证、授权和边界检查。
  • 文件访问在 open()、parser 调用、索引或 LLM 处理前已完成范围限制。
  • 测试覆盖拒绝路径和成功路径。
  • 文档和 .env.example 描述了新的安全敏感配置。
  • 改动已通过 ruff、pytest 以及相关前端 build/tests。
