Português (Brasil): SECURITY.pt_BR.md
This document describes which versions of the application are supported, which dependency baseline is expected, and how to report security vulnerabilities to the maintainers. For copyright and license: NOTICE; for making copyright and trademark official: docs/COPYRIGHT_AND_TRADEMARK.md (pt-BR).
Governance posture index (PII gates, agent containment, supply chain, provenance): docs/SECURITY_GOVERNANCE_POSTURE_HUB.md (pt-BR) — co-located links only; does not replace this vulnerability policy.
- Application (brand): Data Boar. PyPI distribution id: `data-boar` (see CONTRIBUTING.md). Current development targets Python 3.12+.
- We aim to support the latest stable minor versions of Python 3.12 and 3.13 on Linux, macOS and Windows.
- Older Python versions (< 3.12) are not tested and should be considered unsupported.
pyproject.toml is the source of truth for the uv toolchain. The uv.lock file pins the exact resolved dependency tree so that installs are reproducible and users are protected from accidental breakage when a dependency updates (“it worked yesterday”). pip and requirements.txt are derivative (requirements.txt is exported from the lockfile for pip-based or legacy environments). Dependencies are declared in pyproject.toml and managed via uv:
- To install in a fresh environment (uses `uv.lock` for reproducible versions):

      uv sync

- To export a locked `requirements.txt` for environments that use plain pip (same versions as `uv.lock`):

      uv export --no-emit-package pyproject.toml -o requirements.txt

Committing uv.lock is a supply-chain control: it pins the full resolved dependency tree (direct and transitive) so uv sync and CI install the same package versions from hashes recorded in the lockfile. That reduces unreviewed drift from floating >= resolution on PyPI, “works on my machine” breakage, and mismatch between uv.lock and a pip-only requirements.txt export.
The lockfile does not prove that those pins are free of known CVEs or non-malicious. Vulnerability mitigation on the Python install path layers on top of the lockfile: CI pip-audit, GitHub Dependabot (this repo uses package-ecosystem: uv so PRs move pyproject.toml and uv.lock together—see ADR 0044), maintainer review, and optional SBOM artifacts (below). When you triage advisories or apply updates, refresh uv.lock through the single-pass closure (pyproject.toml → uv lock → uv export) in ADR 0030—do not hand-edit the lockfile to “fix” audits.
On Ubuntu/Debian you should have at least:

    sudo apt update
    sudo apt install -y \
      python3.12 python3.12-venv python3.12-dev build-essential \
      libpq-dev libssl-dev libffi-dev unixodbc-dev

Additional client libraries may be required depending on which connectors you use (e.g. Oracle, SQL Server, Snowflake); see the main README.md for connector-specific notes.
Formal CycloneDX JSON SBOMs support supply-chain visibility and incident response (see docs/adr/0003). They complement pip-audit; they are not organizational risk management under ISO 31000 (see COMPLIANCE_FRAMEWORKS.md).
| Artifact | Contents | How it is produced |
|---|---|---|
| `sbom-python.cdx.json` | Python dependencies aligned with `uv.lock` (via `uv export` + `cyclonedx-py`) | Workflow SBOM, local `scripts/generate-sbom.ps1` |
| `sbom-docker-image.cdx.json` | Packages in the built OCI image (OS + Python layers) | `syft` in `anchore/syft:v1.28.0` against image `data_boar:sbom` built from the Dockerfile at the same commit |
Where to download: GitHub Actions workflow SBOM uploads both files as workflow artifacts (runs on version tags v*, on release: published, on workflow_dispatch, and on path-filtered PRs to main). When a GitHub Release already exists for the tag, the same files are attached to that release.
Docker Hub: When you follow docs/ops/DOCKER_IMAGE_RELEASE_ORDER.md, the published image fabioleitao/data_boar:<semver> should match the same source tree as the tag used for the SBOM workflow; the image SBOM is from a local build in CI (equivalent layers to a clean docker build at that commit), not from a separate registry pull.
- Dependencies in `pyproject.toml` use minimum versions (`>=`) so security patches are allowed; pin exact versions (`==`) only where necessary. The lockfile (`uv.lock`) is committed so that everyone (and CI) installs the same tree; it is refreshed when dependencies change or before a stable release so the app stays updated, compatible, and safe. Dependabot (see `.github/dependabot.yml`) opens weekly PRs for pip and GitHub Actions and helps signal when to act: when you apply an update (or before a release), update `pyproject.toml` first, then run `uv lock` and `uv export --no-emit-package pyproject.toml -o requirements.txt`, and commit `pyproject.toml`, `uv.lock`, and `requirements.txt`. Do not merge a change that only edits `requirements.txt` or `uv.lock` without updating the other. Merge dependency PRs only after CI (tests and audit) passes.
The trigger for a change (CI, Dependabot, Docker Scout, review feedback, maintainer choice, or another signal) does not change the workflow. When you decide an update is justified and safe after tests and audit:
- Express intent in `pyproject.toml`, then `uv lock` and `uv export --no-emit-package pyproject.toml -o requirements.txt`; commit all three together.
- Run `uv sync` locally so `.venv` matches the lockfile.
- Run `.\scripts\check-all.ps1` (full gate) before merge; no “half green” dependency PRs.
- Refresh SBOM artifacts when your release or compliance path requires an updated bill of materials at the same commit; see ADR 0003 and `scripts/generate-sbom.ps1` / workflow `SBOM`.
- Add or update an ADR when the bump reflects a policy or architecture choice (optional extras boundaries, toolchains, recorded upstream constraints), not for every routine patch.
This is not permission to churn dependencies blindly; defer or reject changes that lack rationale or fail gates. Recorded decision: ADR 0030.
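The rule that `uv.lock` and `requirements.txt` must change together can be checked mechanically. The following is a minimal sketch, not part of this repo's scripts: the function name `missing_lock_files` and the idea of feeding it a list of changed paths (for example from `git diff --name-only`) are assumptions made for illustration.

```python
from typing import Iterable

# The lockfile and its pip export; per the policy above, a change that
# touches one of them must touch the other as well.
LOCK_FILES = ("uv.lock", "requirements.txt")


def missing_lock_files(changed_files: Iterable[str]) -> list[str]:
    """Return the lock-derived files a partial update forgot to include.

    An empty list means the change either leaves both files alone or
    updates both of them together.
    """
    changed = set(changed_files)
    touched = [name for name in LOCK_FILES if name in changed]
    if not touched:
        return []  # not a lockfile change at all
    return [name for name in LOCK_FILES if name not in changed]


# Hypothetical changed-file lists, as a CI step might collect them.
print(missing_lock_files(["pyproject.toml", "uv.lock"]))
print(missing_lock_files(["pyproject.toml", "uv.lock", "requirements.txt"]))
```

The check only compares paths at the repo root and does not verify that the export matches the lockfile content; re-running `uv export` and diffing the result would be the stronger test.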
- Locally, install and run a dependency audit (CI does the same on every push/PR):

      uv sync
      uv pip install pip-audit
      uv run pip-audit

- Whenever you change dependencies (including when applying Dependabot or automation), edit `pyproject.toml` first, then run `uv lock` and `uv export --no-emit-package pyproject.toml -o requirements.txt` so `uv.lock` and `requirements.txt` stay in sync with `pyproject.toml`.
- Local triage (Dependabot + image CVEs): On Windows, from the repo root, run `.\scripts\maintenance-check.ps1` after `gh auth login` (lists open Dependabot PRs) and with Docker Desktop if you want `docker scout quickview` on the published image. It does not modify the repo. After fixing deps or the Dockerfile, rebuild and push the image, then re-run Scout on the new digest. The Dockerfile upgrades pip and wheel in both builder and runtime layers so scans do not flag stale tooling copied from old layers; `requirements.txt` is uv-exported and typically does not list `wheel` as an app dependency.
- Blocked-dependency triage checkpoint (last review: 2026-05-22): `uv.lock` on `main` carries `pygments` 2.20.0 and `pyopenssl` 26.2.0 (via `snowflake-connector-python` 4.5.0 with optional `bigdata`). Dependabot alerts #9 / #10 and CVE-2026-4539 should remain closed while those pins hold. Re-verify on each Band A order –1 pass and at least quarterly; see PLANS_TODO.md, Quarterly blocked-dependency checkpoint.
- pyOpenSSL Dependabot alerts (#9 / #10) and Snowflake: Resolved as of 2026-05-22 (connector 4.5.0+). Historical cap and reopen steps: docs/ops/DEPENDABOT_PYOPENSSL_SNOWFLAKE.md.
- Pygments Dependabot / pip-audit (CVE-2026-4539): Resolved as of 2026-05-22 (pygments 2.20.0 on PyPI). Historical triage and bump steps: docs/ops/DEPENDABOT_PYGMENTS_CVE.md.
- Code scanning baseline: CodeQL workflow uses `security-and-quality` for Python and should stay enabled on push/PR/schedule. Keep this broad suite plus project-specific hardening tests/rules; if a new query is noisy, triage and document before considering suppression.
- Semgrep (OSS): The Semgrep GitHub Actions workflow runs ruleset `p/python` on push/PR (complements CodeQL). Exclusions and rationale: `docs/plans/completed/PLAN_SEMGREP_CI.md`.
- Bandit: Bandit (strict) runs as part of the CI workflow on push/PR (`[tool.bandit]` in `pyproject.toml`). Details and low-severity triage: `docs/plans/completed/PLAN_BANDIT_SECURITY_LINTER.md`.
- CI workflow supply chain: Workflows under `.github/workflows/` pin third-party GitHub Actions to full commit SHAs (version tag in YAML comments for humans). The `astral-sh/setup-uv` step pins a specific uv CLI semver, not `latest`, so installs do not float silently between runs. Dependabot may propose SHA bumps; review upstream release notes before merge. This reduces tag-moving and unexpected action updates but is not a guarantee against zero-day compromise of a pinned commit, supply-chain attacks that pass review, or risks outside CI (for example local developer tooling). See `docs/adr/ADR-0005-ci-github-actions-supply-chain-pins.md`.
This approach is part of the project’s security baseline. For the full list of hardening measures and status, see docs/plans/completed/PLAN_SECURITY_HARDENING.md.
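The commit-SHA pinning rule for workflows lends itself to a quick local check. The sketch below is an assumption-laden illustration, not a script from this repo: `unpinned_actions` is a hypothetical helper, it scans lines with a regular expression instead of parsing YAML, and the SHA in the sample is a placeholder.

```python
import re

# Matches a workflow `uses:` line and captures the action reference.
_USES = re.compile(r"^\s*(?:-\s*)?uses:\s*(\S+)")
# A reference that ends in a full 40-character hexadecimal commit SHA.
_FULL_SHA = re.compile(r"@[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return `uses:` references that are not pinned to a full commit SHA."""
    found = []
    for line in workflow_text.splitlines():
        match = _USES.match(line)
        if match is None:
            continue
        ref = match.group(1).strip("'\"")
        # Local actions and Docker references are out of scope for this check.
        if ref.startswith(("./", "docker://")):
            continue
        if _FULL_SHA.search(ref) is None:
            found.append(ref)
    return found


SAMPLE = """\
steps:
  - uses: actions/checkout@0123456789abcdef0123456789abcdef01234567 # placeholder SHA
  - name: Install uv
    uses: astral-sh/setup-uv@v5
  - uses: ./.github/actions/local
"""
print(unpinned_actions(SAMPLE))  # prints ['astral-sh/setup-uv@v5']
```

A check like this only confirms the shape of the reference; it cannot tell whether the pinned commit is the one the version comment claims.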
- SQL injection: Table and column names used in dynamic SQL (connectors) come from the database inspector (discover), not from user input. Identifiers are escaped per dialect: double-quote for SQLite/Postgres/Oracle (`"` → `""`), backtick for MySQL (each backtick is doubled). The local audit database (SQLite) uses SQLAlchemy ORM and parameterized queries only; `session_id` and other user-supplied values are never concatenated into raw SQL. See `tests/test_security.py` for regression tests.
- Path traversal: `session_id` in API paths is validated with a strict pattern (alphanumeric and underscore, 12–64 chars) before use in file paths or lookups; invalid values return HTTP 400. See `api/routes.py` (`_validate_session_id`) and `tests/test_security.py`.
- Input validation (tenant/technician): Tenant and technician values (scan start body, session PATCH, config-driven scan) are validated for length and allowed characters (printable, no control chars), then sanitized before storage so reports and the dashboard never display unsanitized input. See `core/validation.py` (`sanitize_tenant_technician`) and `tests/test_security.py`.
- Credential injection in connection URLs: User and password are URL-encoded when building database connection URLs (SQL connector, MongoDB connector) so that special characters (`@`, `:`, `/`, `#`) in credentials do not break URL parsing and are not misinterpreted as host/path. See `connectors/sql_connector.py` (`_quote_userinfo` / `_build_url`), `connectors/mongodb_connector.py` (`connect()`), and `tests/test_security.py` (e.g. `test_sql_connector_build_url_encodes_password_special_chars`, `test_mongodb_connector_uri_encodes_password_special_chars`).
- Config and serialization: YAML config is loaded with `yaml.safe_load` (no arbitrary Python object deserialization). See `tests/test_security.py` for a test that unsafe YAML tags are rejected.
- Config endpoint exposure: When `api.require_api_key` is true, GET `/config` returns 401 without a valid API key, so raw config (which may contain secrets) is not exposed. GET `/config` always redacts secret values (passwords, API key, tokens, client_secret, etc.) before sending YAML to the browser, so the UI never displays or transmits plain secrets; on save, placeholders are merged with the current file so real secrets are not overwritten. See `config/redact_config.py` and `tests/test_security.py`.
- Config file and secrets: Restrict config file permissions (e.g. `chmod 600` on `config.yaml`) so only trusted users can read it. Do not commit `config.yaml` or any file containing credentials to version control; use `config.example.yaml` (or `deploy/config.example.yaml`) as a template and keep local config in `.gitignore` (the project ignores `config.yaml`, `config.local.yaml`, and `*.vault`). If root `config.yaml` was ever committed, run `git rm --cached config.yaml` so Git stops tracking it (file stays on disk); old commits may still contain the blob, so rewrite history or rotate secrets if the repo was public. Prefer storing secrets in environment variables (e.g. `pass_from_env`, `api_key_from_env`) so the config file holds no plain secrets. A password manager (e.g. Bitwarden) is a good place for the operator to store copies of those secrets and rotate them; see `docs/ops/OPERATOR_SECRETS_BITWARDEN.md`. Homelab reality (hostnames, LAN IPs, inventory): keep under gitignored `docs/private/homelab/`; see `docs/PRIVATE_OPERATOR_NOTES.md`. See `docs/USAGE.md` (Configuration), `CONTRIBUTING.md` (public repo hygiene), and `docs/plans/PLAN_SECRETS_VAULT.md` (Phase A).
- Operator notifications (webhooks): Optional `notifications` can POST scan summaries to Slack, Teams, a generic URL (e.g. Signal bridge), or Telegram fields for legacy/third-party configs. The canonical maintainer policy is not to use Telegram for Data Boar; see `docs/ops/OPERATOR_NOTIFICATION_CHANNELS.md`. Treat webhook URLs and bot tokens as secrets; use `${ENV_VAR}` in YAML or env-only wiring. See `docs/USAGE.md` (§5.1). Sends retry on transient failures (5xx / network); they do not replace TLS or network policy.
- API bind vs API key: When starting `--web`, if the resolved bind address is non-loopback (e.g. `0.0.0.0`) and an API key is not effectively configured, the process prints a stderr warning (see `main.py` and `core/host_resolution.py`). Prefer loopback + reverse proxy, or `api.require_api_key` with a strong key in production.
- Request body size limit: The API rejects requests whose Content-Length exceeds 1 MB (e.g. POST `/config`, POST `/scan`, POST `/scan_database`) with HTTP 413 Payload Too Large to reduce DoS via huge JSON or form bodies. See `api/routes.py` (`request_body_size_middleware`) and `tests/test_security.py` for regression tests.
- Logging policy: API key, passwords, and connection strings must not appear in audit or application logs. Failure details and exception messages use `core.validation.sanitize_log_text` (secrets plus detector-aligned PII shapes such as CPF, email, card runs) and `core.validation.clean_error` for exceptions before SQLite or log handlers; connection URLs and `password=` / `api_key=`-style values are masked. Do not log raw config, request bodies, or driver/HTTP exception text that might contain credentials or customer data. See `core/database.py` (`save_failure`) and `tests/test_security.py` (`test_redact_*`, `test_sanitize_log_text_*`, `test_clean_error_*`).
- Report and heatmap access: Report and heatmap endpoints validate `session_id` format before use; invalid IDs return 400, unknown or missing sessions return 404 (no session enumeration or 403/404 distinction for unknown IDs). See `api/routes.py` and `docs/SECURITY.md`.
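Two of the input-handling rules in the list above, the strict `session_id` pattern and per-dialect identifier quoting, can be sketched in a few lines. This is an illustration under stated assumptions, not the code in `api/routes.py` or the connectors: the names `is_valid_session_id` and `quote_identifier` and the dialect strings are hypothetical.

```python
import re

# Alphanumeric and underscore only, 12 to 64 characters, as described above.
_SESSION_ID = re.compile(r"[A-Za-z0-9_]{12,64}")


def is_valid_session_id(value: str) -> bool:
    """True only when the whole string matches the strict pattern."""
    return _SESSION_ID.fullmatch(value) is not None


def quote_identifier(name: str, dialect: str) -> str:
    """Quote a table or column name by doubling the dialect's quote character.

    MySQL uses backticks; the other dialects named above use double quotes.
    """
    quote = "`" if dialect == "mysql" else '"'
    return quote + name.replace(quote, quote * 2) + quote


print(is_valid_session_id("abc123_DEF456"))   # 13 allowed characters
print(is_valid_session_id("../etc/passwd"))   # dots and slashes are rejected
print(quote_identifier('odd"name', "postgresql"))
print(quote_identifier("odd`name", "mysql"))
```

Because the pattern excludes `.`, `/`, and `\`, a value that passes cannot introduce a path separator or a parent-directory segment when it is later used in a file path.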
For a technician-oriented summary (what to watch for, regression tests, recommendations), see docs/SECURITY.md (EN) and docs/SECURITY.pt_BR.md (pt-BR). For completed and planned hardening steps, see docs/plans/completed/PLAN_SECURITY_HARDENING.md.
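The credential-encoding item in the list above can be illustrated with the standard library. This is a minimal sketch, assuming percent-encoding via `urllib.parse.quote`; `build_connection_url` is a hypothetical name and is not the repo's `_build_url`, and the host and credentials are made up.

```python
from urllib.parse import quote, unquote, urlsplit


def build_connection_url(
    scheme: str, user: str, password: str, host: str, port: int, database: str
) -> str:
    """Build a connection URL with percent-encoded user and password."""
    # safe="" also encodes "/", which quote() leaves untouched by default.
    userinfo = f"{quote(user, safe='')}:{quote(password, safe='')}"
    return f"{scheme}://{userinfo}@{host}:{port}/{quote(database, safe='')}"


url = build_connection_url(
    "postgresql", "scanner", "p@ss:w/rd#1", "db.example.internal", 5432, "audit"
)
parts = urlsplit(url)
print(parts.hostname, parts.port)  # host and port survive intact
print(unquote(parts.password))     # decodes back to the original password
```

Without the encoding step, the `@` in the password would end the userinfo section early and the `#` would start a URL fragment, so the parser would see a different host and a truncated password.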
The application adds the following headers to all web and API responses by default:
- X-Content-Type-Options: nosniff – prevents MIME-type sniffing.
- X-Frame-Options: DENY – prevents the app from being embedded in frames (clickjacking mitigation).
- Content-Security-Policy – restricts script, style, and resource origins to the app and the Chart.js CDN; allows inline scripts/styles required by the current dashboard.
- Referrer-Policy: strict-origin-when-cross-origin – limits referrer information sent on cross-origin requests.
- Permissions-Policy – disables browser features not needed by the app (camera, microphone, geolocation, etc.).
- Strict-Transport-Security (HSTS) – set only when the request is considered HTTPS (direct or via `X-Forwarded-Proto: https` from a trusted proxy), so HTTP-only deployments are not locked out. When present, it uses `max-age=31536000; includeSubDomains; preload`.
When the app is behind a reverse proxy (e.g. nginx, Caddy, load balancer), ensure the proxy sets X-Forwarded-Proto: https for TLS-terminated requests so HSTS is applied correctly. Do not enable HSTS at the app layer for plain HTTP; the proxy can add HSTS when serving over HTTPS.
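The conditional HSTS behavior described above can be sketched as a small pure function. This is an illustration, not the middleware in `api/routes.py`: the function name, its parameters, and the `trust_proxy` flag are assumptions, and the Content-Security-Policy and Permissions-Policy headers are left out because their exact values are not given here.

```python
from typing import Optional

HSTS_VALUE = "max-age=31536000; includeSubDomains; preload"


def security_headers(
    scheme: str, forwarded_proto: Optional[str], trust_proxy: bool
) -> dict[str, str]:
    """Return the fixed headers, adding HSTS only for HTTPS requests."""
    headers = {
        "X-Content-Type-Options": "nosniff",
        "X-Frame-Options": "DENY",
        "Referrer-Policy": "strict-origin-when-cross-origin",
    }
    # The forwarded header counts only when it comes from a trusted proxy.
    via_proxy = trust_proxy and (forwarded_proto or "").strip().lower() == "https"
    if scheme == "https" or via_proxy:
        headers["Strict-Transport-Security"] = HSTS_VALUE
    return headers


print(security_headers("http", None, trust_proxy=True))
print(security_headers("http", "https", trust_proxy=True))
```

Ignoring `X-Forwarded-Proto` when the proxy is not trusted matters because any client can send that header; honoring it blindly would let a plain-HTTP request trigger HSTS.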
The API does not implement authentication by default; secure the app at the reverse proxy or network level when exposed. For enterprises that want a simple shared-secret gate without changing the “secure at proxy” model, the application supports an optional API key:
- In config, set `api.require_api_key: true` and either `api.api_key` (literal; avoid committing secrets) or `api.api_key_from_env: "VAR"` (read key from environment at startup). When enabled, GET `/health` stays unauthenticated on purpose: it returns liveness JSON (status, public `license` summary, `dashboard_transport`) for probes. Every other route must include `X-API-Key` or `Authorization: Bearer <key>` when a key is successfully resolved from config/env. 401 = missing or wrong key. 503 = `require_api_key` is true but no key could be resolved (misconfiguration). `main.py --web` exits with code 2 before listening if the key is required but missing, so you do not accidentally run an open API.
- Good practice: Use a strong, random key and store it in an environment variable (e.g. `api_key_from_env: "AUDIT_API_KEY"`). Do not log the key or commit it to version control. This is a simple gate only; for full authentication and authorization, continue to use the reverse proxy or an identity provider.
- Concrete operator steps (shell, systemd, Docker/K8s patterns, `curl` checks, synthetic example key): API_KEY_FROM_ENV_OPERATOR_STEPS.md. For ordering (inventory clients, staging first, monitor `dashboard_transport` and audit export), see SECURE_BY_DEFAULT_BLOCKERS_AND_MIGRATION.md.
- End-to-end technician guide (API key + TLS paths, Let’s Encrypt, lab self-signed, Docker): SECURE_DASHBOARD_AUTH_AND_HTTPS_HOWTO.md (pt-BR).
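The status codes in the list above (open `/health`, 401 for a missing or wrong key, 503 for a required but unresolved key) can be expressed as one decision function. This is a sketch, not the application's middleware: `check_api_key` and its parameters are hypothetical, the constant-time comparison with `hmac.compare_digest` is a choice made for this example, and header names are assumed to arrive in the exact casing shown.

```python
import hmac
from typing import Mapping, Optional


def check_api_key(
    method: str,
    path: str,
    headers: Mapping[str, str],
    require_api_key: bool,
    resolved_key: Optional[str],
) -> int:
    """Return the HTTP status the gate would produce; 200 means "let it through"."""
    if not require_api_key:
        return 200
    if method == "GET" and path == "/health":
        return 200  # liveness probe stays unauthenticated on purpose
    if not resolved_key:
        return 503  # key required but none resolved: misconfiguration
    supplied = headers.get("X-API-Key")
    if supplied is None:
        auth = headers.get("Authorization", "")
        if auth.startswith("Bearer "):
            supplied = auth[len("Bearer "):]
    if supplied is None:
        return 401
    # Constant-time comparison avoids leaking how many characters matched.
    if hmac.compare_digest(supplied.encode(), resolved_key.encode()):
        return 200
    return 401


print(check_api_key("GET", "/config", {}, True, "example-key"))
print(check_api_key("GET", "/config", {"X-API-Key": "example-key"}, True, "example-key"))
```

Returning 503 instead of 401 for the unresolved-key case keeps a misconfigured server from looking like a client error, which makes the problem visible to the operator.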
Security headers (including CSP) are implemented in api/routes.py (middleware applied to web and API responses). For operator-facing hardening (containers, reverse proxy, TLS, WAF), see docs/USAGE.md and docs/deploy/DEPLOY.md (Security and hardening). To harden container and cluster deployments:
- Docker and Kubernetes: See `docs/deploy/DEPLOY.md`, section “Security and hardening (optional)”, for:
  - Running as non-root, resource limits, and healthchecks.
  - Optional Kubernetes examples: securityContext (runAsNonRoot, readOnlyRootFilesystem, drop capabilities), NetworkPolicy (`deploy/kubernetes/network-policy.example.yaml`), and PodDisruptionBudget (`deploy/kubernetes/pdb.example.yaml`).
When the API or dashboard is exposed to the internet or untrusted networks, run it behind a reverse proxy with TLS, proper authentication/authorization, and consider a WAF (web application firewall). The app’s API key and rate limiting (see docs/USAGE.md) complement but do not replace proxy-level security.
If you believe you have found a security vulnerability in this project:
- Do not open a public issue with exploit details.
- Instead, please:
  - Open a new issue in the Issues tab with a short, high-level description (no sensitive PoC data), or
  - If GitHub security advisories or private reporting is available for this repo, prefer that channel.
- Include at least:
  - Version/commit of the project you are using.
  - Python version and OS details.
  - A minimal description of the impact (e.g. information disclosure, privilege escalation, DoS).
- The maintainers will:
  - Acknowledge receipt as soon as reasonably possible.
  - Investigate and, if confirmed, work on a fix and coordinate disclosure.
If you are unsure whether something is security-sensitive, err on the side of caution and use the private channel (or a minimal public issue) so we can triage it safely.
These are targets for maintainers and reporters, not contractual obligations. Adjust to your capacity.
| Area | Optional target |
|---|---|
| Vulnerability reports | We aim to acknowledge within 5 working days and, for high/critical findings, to fix or document (e.g. advisory, mitigation, or “won’t fix” with rationale) within 30 days. |
| Dependabot security PRs | We treat Dependabot security PRs as P0: aim to merge or respond (e.g. merge, close with comment, or defer with rationale) within 5 working days. Non-security dependency PRs follow the usual review cycle. |
See CONTRIBUTING for how to apply dependency updates and run pip-audit; see .github/dependabot.yml for Dependabot configuration.
Use this matrix to prioritize CodeQL findings by impact and release risk. Map each alert to the nearest rule family and code surface, then decide fix-now vs scheduled.
| Priority | Rule IDs (examples) | Typical code surface in this repo | Action |
|---|---|---|---|
| P0 (fix before release) | `py/path-injection`, `py/sql-injection`, `py/nosql-injection`, `py/code-injection`, `py/command-line-injection`, `py/template-injection`, `py/full-ssrf`, `py/unsafe-deserialization`, `py/xxe` | API file serving and report paths (`api/routes.py`), connectors/query builders (`connectors/*`, `database/*`), config/load paths (`config/*`) | Patch immediately, add regression test, and verify with CodeQL rerun. |
| P1 (fix in current cycle) | `py/weak-sensitive-data-hashing`, `py/clear-text-logging-sensitive-data`, `py/clear-text-storage-sensitive-data`, `py/insecure-protocol`, `py/insecure-default-protocol`, `py/url-redirection`, `py/regex-injection`, `py/redos` | Licensing/integrity helpers (`core/licensing/*`), logging and failure persistence (`core/database.py`, `core/validation.py`), network connectors (`connectors/*`) | Fix or document mitigation in this release cycle; add tests where practical. |
| P2 (scheduled hardening / monitor) | `py/bind-socket-all-network-interfaces`, `py/flask-debug`, `py/client-exposed-cookie`, `py/insecure-cookie`, `py/samesite-none-cookie`, `py/stack-trace-exposure`, `py/use-of-input` | Host/runtime settings (`core/host_resolution.py`, Docker defaults), web middleware/templates (`api/routes.py`, `api/templates/*`) | Keep enabled, monitor trends, and batch low-risk fixes with maintenance PRs. |
Notes:
- Do not disable broad suites by default. Keep `security-and-quality` and use targeted code fixes + tests first.
- If you must defer a finding, record reason + compensating control in PR/issue and revisit next `-1`/`-1b` loop.
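For bulk triage, the matrix above can be turned into a simple lookup. The sketch below is illustrative only: the `triage` helper and the `"unmapped"` fallback are assumptions, and the rule lists are copied from the example IDs in the table, not from an exhaustive CodeQL catalog.

```python
# Example rule IDs per priority, copied from the triage matrix above.
RULES_BY_PRIORITY = {
    "P0": (
        "py/path-injection", "py/sql-injection", "py/nosql-injection",
        "py/code-injection", "py/command-line-injection",
        "py/template-injection", "py/full-ssrf",
        "py/unsafe-deserialization", "py/xxe",
    ),
    "P1": (
        "py/weak-sensitive-data-hashing", "py/clear-text-logging-sensitive-data",
        "py/clear-text-storage-sensitive-data", "py/insecure-protocol",
        "py/insecure-default-protocol", "py/url-redirection",
        "py/regex-injection", "py/redos",
    ),
    "P2": (
        "py/bind-socket-all-network-interfaces", "py/flask-debug",
        "py/client-exposed-cookie", "py/insecure-cookie",
        "py/samesite-none-cookie", "py/stack-trace-exposure", "py/use-of-input",
    ),
}

# Invert the table once so each lookup is a single dictionary access.
PRIORITY_BY_RULE = {
    rule: priority
    for priority, rules in RULES_BY_PRIORITY.items()
    for rule in rules
}


def triage(rule_id: str) -> str:
    """Return the matrix priority for a rule ID, or "unmapped" if it is not listed."""
    return PRIORITY_BY_RULE.get(rule_id, "unmapped")


print(triage("py/sql-injection"))  # prints P0
print(triage("py/some-new-rule"))  # prints unmapped
```

An `"unmapped"` result is a prompt to map the alert to the nearest rule family by hand, as the matrix instructions say, not a signal that the finding is low priority.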