A sector-aware OSINT vulnerability intelligence API.
Aggregates NVD, CISA KEV and GitHub Advisories, scores each CVE against
per-sector profiles, and exposes the result as JSON, RSS and a Swagger UI.
Live API · Swagger UI · Finance dashboard (sample)
Three curl calls against the live API: top finance threats, RSS subscription, audit-grade score breakdown. Rendered with VHS from docs/assets/hero-demo.tape.
The three calls below are the same ones the GIF runs.
curl -s 'https://threat-intel-api-production.up.railway.app/api/v1/sectors/finance/dashboard' \
| jq '.top_24h[:5] | .[] | {cve: .external_id, score, cvss: .cvss_score, title: .title[0:80]}'
{
"cve": "CVE-2026-7579",
"score": 35.0,
"cvss": 7.3,
"title": "Hard-coded credentials in AstrBot dashboard auth (CWE-798)"
}
curl 'https://threat-intel-api-production.up.railway.app/api/v1/sectors/finance/feed.rss?min_score=70'
curl -s 'https://threat-intel-api-production.up.railway.app/api/v1/sectors/finance/dashboard' \
| jq '.top_24h[0].score_breakdown'
{
"cwe_match": { "hit": true, "matched": ["CWE-798"], "points": 20 },
"cvss_threshold": { "hit": true, "threshold": 7.0, "cvss_score": 7.3, "points": 15 },
"kev": { "hit": false, "points": 0 },
"technology_match": { "hit": false, "matched": [], "points": 0 }
}
Excerpt; the full breakdown also exposes keyword_match, priority_boost,
excluded, actively_exploited, ransomware, multi_source, package_match,
plus raw_total and final_score. Every score is auditable: you can always see
why a CVE landed where it did.
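As a rough illustration of what "auditable" means here, the sketch below adds up the `points` of each criterion in the excerpt above. It assumes the raw total is a plain sum of per-criterion points; the actual algorithm (caps, exclusions, boosts) is described in docs/SCORING.md and may differ. `sum_points` is a hypothetical helper, not part of the API.

```python
from typing import Any


def sum_points(breakdown: dict[str, Any]) -> float:
    """Add up the `points` field of every per-criterion entry.

    Assumption: the raw total is a plain sum. Scalar fields such as
    raw_total or final_score are skipped because they are not dicts.
    """
    total = 0.0
    for entry in breakdown.values():
        if isinstance(entry, dict) and "points" in entry:
            total += float(entry["points"])
    return total


# The four criteria shown in the excerpt above.
excerpt = {
    "cwe_match": {"hit": True, "matched": ["CWE-798"], "points": 20},
    "cvss_threshold": {"hit": True, "threshold": 7.0, "cvss_score": 7.3, "points": 15},
    "kev": {"hit": False, "points": 0},
    "technology_match": {"hit": False, "matched": [], "points": 0},
}

print(sum_points(excerpt))  # 35.0
```

For this excerpt the sum is 35.0, which lines up with the score shown for the sample CVE earlier. The full breakdown has more criteria than the excerpt, so treat this as a sanity check rather than a reimplementation.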
A bank's security team and a hospital's IT team do not need the same threat feed, but they usually get the same one. NVD publishes around 150 CVEs a day. CISA KEV adds the subset attackers are actively using. GitHub Advisories covers the open-source supply chain. None of those feeds is wrong; none is targeted either.
This API ingests the three sources, deduplicates across them, and scores each CVE against a YAML-defined sector profile: keywords, technology stack, weighted CWEs, exclusions, CVSS threshold. A finance team gets a feed weighted on payment rails and authentication. An industrial team gets one weighted on PLC vendors and ICS protocols.
A SIEM, a SOAR, or a human analyst can consume the result without tuning the API itself, because all the tuning sits in version-controlled YAML.
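A consumer-side sketch, using only the Python standard library: it fetches a sector dashboard and keeps entries above a score threshold. The field names (`top_24h`, `external_id`, `score`, `cvss_score`, `title`) are the ones used in the jq examples above; `fetch_dashboard` and `top_threats` are hypothetical helper names, and the rest of the response shape is not assumed.

```python
import json
from typing import Any
from urllib.request import urlopen

BASE_URL = "https://threat-intel-api-production.up.railway.app/api/v1"


def fetch_dashboard(sector: str, timeout: float = 10.0) -> dict[str, Any]:
    """GET the sector dashboard and decode the JSON body."""
    with urlopen(f"{BASE_URL}/sectors/{sector}/dashboard", timeout=timeout) as resp:
        return json.load(resp)


def top_threats(
    dashboard: dict[str, Any], min_score: float, limit: int = 5
) -> list[dict[str, Any]]:
    """Project top_24h entries to the same fields as the jq example."""
    rows = []
    for threat in dashboard.get("top_24h", []):
        if threat.get("score", 0) < min_score:
            continue
        rows.append(
            {
                "cve": threat["external_id"],
                "score": threat["score"],
                "cvss": threat.get("cvss_score"),
                "title": (threat.get("title") or "")[:80],
            }
        )
    return rows[:limit]
```

Calling `top_threats(fetch_dashboard("finance"), min_score=30)` would return rows shaped like the jq output above. Keeping the filter separate from the HTTP call makes it easy to test against a saved payload.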
| Feature | Details |
|---|---|
| Multi-source ingestion | NVD, CISA KEV and GitHub Advisories with cross-feed deduplication |
| Sector-aware scoring | 6 public profiles (finance, healthcare, ICS, gov, SaaS, e-commerce) |
| Async pipeline | httpx + APScheduler, field-level priority on conflicting sources |
| SIEM-ready feeds | JSON, RSS 2.0, paginated and filterable |
| Hardened by default | OWASP API Top 10, ASVS L2, CIS Docker; details in SECURITY.md |
| Hot-reload profiles | drop a YAML, POST /admin/reload-profiles, picked up without restart |
| Auditable scoring | every score includes a per-criterion breakdown |
| Per-sector dashboards | top 24h, top 7d, aggregate stats per profile |
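To make "field-level priority on conflicting sources" concrete, here is a minimal sketch of one way such a merge could work: for each field, walk the sources in a fixed priority order and keep the first non-empty value. The priority orders below are invented for illustration, and `merge_records` is a hypothetical name; the real rules live in the ingestion code and docs/COLLECTORS.md.

```python
from typing import Any

# Hypothetical per-field priority: the earliest source with a value wins.
FIELD_PRIORITY: dict[str, list[str]] = {
    "cvss_score": ["nvd", "ghsa", "kev"],
    "title": ["ghsa", "nvd", "kev"],
}
# Hypothetical fallback order for fields without an explicit rule.
DEFAULT_PRIORITY = ["nvd", "kev", "ghsa"]


def merge_records(records: dict[str, dict[str, Any]]) -> dict[str, Any]:
    """Merge several sources' views of the same CVE into one record.

    `records` maps a source name ("nvd", "kev", "ghsa") to that
    source's fields for a single CVE.
    """
    merged: dict[str, Any] = {}
    fields = {name for record in records.values() for name in record}
    for name in sorted(fields):
        for source in FIELD_PRIORITY.get(name, DEFAULT_PRIORITY):
            value = records.get(source, {}).get(name)
            if value is not None:
                merged[name] = value
                break
    # Keep track of which feeds contributed to the merged record.
    merged["sources"] = sorted(records)
    return merged
```

With an NVD record carrying `cvss_score` 7.3 and a GHSA record carrying 7.0 for the same CVE, this sketch keeps 7.3, while a title missing from NVD falls through to the GHSA one.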
flowchart LR
subgraph Sources["OSINT sources"]
direction TB
NVD["NVD<br/>REST / JSON"]
KEV["CISA KEV<br/>REST / JSON"]
GHSA["GitHub Advisories<br/>GraphQL"]
end
subgraph Pipeline["Ingestion & scoring"]
direction TB
Ingest["IngestService<br/>cross-source dedup"]
YAML["profiles/*.yaml<br/>hot-reloadable"]
Loader["SectorProfileLoader"]
Scoring["SectorScoringService"]
end
subgraph Storage["PostgreSQL"]
direction TB
DB[("threat,<br/>threat_source,<br/>threat_indicator")]
Scores[("threat_sector_score")]
end
API["FastAPI v1"]
Consumers(["SIEM / SOAR / analyst"])
NVD -->|"httpx, 60 min"| Ingest
KEV -->|"httpx, 6 h"| Ingest
GHSA -->|"httpx, 2 h"| Ingest
Ingest -->|"SQLAlchemy 2.0"| DB
YAML --> Loader
Loader --> Scoring
DB --> Scoring
Scoring --> Scores
DB --> API
Scores --> API
API --> Consumers
Detailed component docs: docs/COLLECTORS.md, docs/SCORING.md, docs/INDICATORS.md.
| Layer | Tools |
|---|---|
| Runtime | Python 3.12, FastAPI, async/await, Uvicorn |
| Data | PostgreSQL 16, SQLAlchemy 2.0, Alembic |
| Ingestion | httpx, respx (tests), APScheduler |
| Validation | Pydantic v2, RFC 7807 problem+json |
| DevSecOps | Bandit, Semgrep, Ruff, mypy strict, pip-audit, Trivy, Hadolint, Gitleaks, CycloneDX SBOM |
| Observability | structlog (JSON logs, PII scrubbing), Sentry, /health |
| Container | Multi-stage Alpine, distroless-style runtime, non-root uid 1001, ~195 MB |
| CI/CD | GitHub Actions (7-job security gate), branch-protected main, Railway deploy on merge |
git clone https://github.com/Setounkpe7/threat-intel-api.git
cd threat-intel-api
cp .env.example .env # optional — fill in ADMIN_API_KEY / GITHUB_TOKEN to unlock admin endpoints and GHSA
docker compose up --build
Then curl localhost:8000/health and open http://localhost:8000/docs.
After the stack is up, optionally seed demo data so the sector dashboards have something to show:
docker compose exec app python scripts/demo_seed.py
Full setup (local Python venv, SQLite mode, env reference, troubleshooting) lives in docs/INSTALLATION.md.
API guide with worked examples: docs/API_USAGE.md.
Each profile is one YAML file. Drop it in profiles/public/ and hit the reload endpoint. The new dashboard, threats endpoint and RSS feed appear without restarting the API or running a migration.
# profiles/public/finance.yaml (excerpt)
id: finance
name: Finance & Banking
sector: financial-services
keywords: [swift, iso20022, pci-dss, banking, payment-rail]
technologies: [oracle-database, ibm-mq, kafka, kubernetes]
cwe_priorities:
CWE-798: 20 # hard-coded credentials
CWE-89: 18 # SQL injection
CWE-287: 16 # improper authentication
cvss_threshold: 7.0
priority_boost: [kev, actively-exploited]
Schema and worked examples: profiles/README.md, scoring algorithm: docs/SCORING.md.
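The sketch below shows how two of the profile fields above could turn into score points. It is an illustration under stated assumptions, not the project's scoring code: the 15 points for crossing `cvss_threshold` are copied from the sample breakdown rather than from the schema, and taking the highest weight among matched CWEs is a guess. `Profile` and `score_cve` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class Profile:
    cwe_priorities: dict[str, int]
    cvss_threshold: float
    # Assumption: 15 comes from the sample score breakdown, not the schema.
    cvss_points: int = 15


# Values copied from the finance.yaml excerpt above.
FINANCE = Profile(
    cwe_priorities={"CWE-798": 20, "CWE-89": 18, "CWE-287": 16},
    cvss_threshold=7.0,
)


def score_cve(cwes: list[str], cvss_score: Optional[float], profile: Profile) -> dict[str, Any]:
    """Build a partial breakdown covering the CWE and CVSS criteria only."""
    matched = [cwe for cwe in cwes if cwe in profile.cwe_priorities]
    # Assumption: only the highest-weighted matching CWE counts.
    cwe_points = max((profile.cwe_priorities[cwe] for cwe in matched), default=0)
    cvss_hit = cvss_score is not None and cvss_score >= profile.cvss_threshold
    cvss_points = profile.cvss_points if cvss_hit else 0
    return {
        "cwe_match": {"hit": bool(matched), "matched": matched, "points": cwe_points},
        "cvss_threshold": {
            "hit": cvss_hit,
            "threshold": profile.cvss_threshold,
            "cvss_score": cvss_score,
            "points": cvss_points,
        },
        "raw_total": cwe_points + cvss_points,
    }


print(score_cve(["CWE-798"], 7.3, FINANCE)["raw_total"])  # 35
```

Keywords, technologies, exclusions and `priority_boost` are left out on purpose; docs/SCORING.md is the reference for how they combine.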
The CI security gate runs on every PR and every job below is blocking. As of this writing the gate spans nine sub-jobs plus an umbrella status check on main and dev.
- Bandit: Python AST audit (CWE coverage tuned for web)
- Semgrep: pattern rules including OWASP Top 10
- Ruff with security ruleset
- mypy: strict mode on src/
- Gitleaks: scans the full git history on every PR; fails the gate on any committed credential, key, or token
- pip-audit: runtime CVE scan against requirements.lock
- Dependabot: weekly updates, grouped, auto-merged on green
- CycloneDX SBOM: generated and attached to release artifacts
- Multi-stage build, runtime image without pip/setuptools/wheel
- Non-root user (uid 1001) by default
- Hadolint lints the Dockerfile in CI
- Trivy image scan; CRITICAL+HIGH count must be 0
- Image is published to ghcr.io/setounkpe7/threat-intel-api on merge to main, signed with Sigstore cosign (keyless, OIDC-bound to this repo's GitHub Actions identity), and shipped with a CycloneDX SBOM and SLSA L2 build-provenance attestation
Any consumer can cryptographically verify the image, the SBOM, and the build provenance:
IMAGE=ghcr.io/setounkpe7/threat-intel-api:sha-<short-sha>
IDENTITY_REGEX='^https://github\.com/Setounkpe7/threat-intel-api/\.github/workflows/release\.yml@refs/heads/main$'
OIDC_ISSUER=https://token.actions.githubusercontent.com
cosign verify "${IMAGE}" \
--certificate-identity-regexp "${IDENTITY_REGEX}" \
--certificate-oidc-issuer "${OIDC_ISSUER}"
cosign verify-attestation --type cyclonedx "${IMAGE}" \
--certificate-identity-regexp "${IDENTITY_REGEX}" \
--certificate-oidc-issuer "${OIDC_ISSUER}"
gh attestation verify oci://${IMAGE} --repo Setounkpe7/threat-intel-api
All three must exit 0. See docs/RELEASING.md for
the rollback procedure and the one-time GHCR setup.
- Security headers middleware (CSP, HSTS, frame-deny, content-type-options)
- Per-IP rate limiter, configurable, with Retry-After
- Structured JSON logs with PII scrubbing
- Sentry for error capture, environment-tagged
- Errors served as RFC 7807 application/problem+json (no stack traces leaked)
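For readers unfamiliar with the pattern, here is a minimal sketch of a per-IP limiter that also computes a `Retry-After` value. It uses a fixed-window counter, which is an assumption: the README does not say which algorithm the middleware uses. `FixedWindowLimiter` is a hypothetical class, and the clock is injectable so the behaviour can be checked without sleeping.

```python
import math
import time
from typing import Callable


class FixedWindowLimiter:
    """Allow at most `limit` requests per IP in each `window_seconds` window."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        # ip -> (window start time, requests counted in that window)
        self._state: dict[str, tuple[float, int]] = {}

    def check(self, ip: str) -> tuple[bool, int]:
        """Return (allowed, retry_after_seconds) for one request from `ip`."""
        now = self.clock()
        start, count = self._state.get(ip, (now, 0))
        if now - start >= self.window:
            # The previous window has expired: start a fresh one.
            start, count = now, 0
        if count < self.limit:
            self._state[ip] = (start, count + 1)
            return True, 0
        # Blocked: seconds until the current window ends, rounded up.
        retry_after = math.ceil(start + self.window - now)
        return False, max(retry_after, 1)
```

A caller would typically answer a blocked request with HTTP 429 and put the second element of the tuple in the `Retry-After` header.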
- OWASP API Security Top 10: controls mapped per item in docs/SECURITY.md
- OWASP ASVS Level 2: gap analysis tracked in repo
- NIST SSDF: practices PO/PS/PW/RV mapped to repo features
- CIS Docker Benchmark: Dockerfile reviewed against the relevant items
- Coordinated disclosure policy: see SECURITY.md
Snapshot from the live deployment (2026-05-01):
| Metric | Value |
|---|---|
| Total CVEs in database | 27,695 |
| New CVEs ingested last 24h | 273 |
| Sources integrated | 3 (NVD + CISA KEV active in production, GHSA in stabilization) |
| Sector profiles available | 6 public |
| Endpoints | 18 (13 public + 5 admin) |
| Python LOC (src/) | ~4,550 |
| Tests | 291 (unit + integration + security) |
| Test coverage gate | ≥ 80% (currently ~88%) |
| Container image | 199 MB Alpine, non-root |
| /health latency (remote client) | p50 ≈ 310 ms · p95 ≈ 410 ms (100 samples, includes DB round-trip) |
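For anyone reproducing the latency row, the sketch below shows one common way to derive p50 and p95 from a list of timings, the nearest-rank method. This is an assumption about method only; the README does not say how the published figures were computed, and `percentile` is a hypothetical helper.

```python
import math


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least
    pct percent of all samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]
```

With 100 samples, p50 is the 50th smallest timing and p95 the 95th, so a handful of slow outliers moves p95 without affecting p50.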
- M1 — NVD ingestion, schema, core API
- M2 — Sector-aware scoring, dashboards, RSS, hot-reload, admin endpoints, rate limiting
- M3 — Production deploy on Railway with security gate
- M3a — Multi-source ingestion (CISA KEV + GHSA), cross-source dedup, IOC lookups
- M3b — GHSA collector stabilization, additional IOC source types
- M4 — Webhook alerting on critical threats
- M5 — STIX 2.1 / TAXII export
- M6 — NLP-based indicator extraction (spaCy)
- M7 — Public web dashboard frontend
Issues and PRs are welcome. Conventions, dev setup, and commit format: CONTRIBUTING.md. Behavior: CODE_OF_CONDUCT.md. Release log: CHANGELOG.md.
make lint typecheck test # the same gates CI runs on a PR
make security-audit # bandit + pip-audit + semgrep, locally
Built by Michel-Ange Doubogan (cybersecurity, Python). LinkedIn · Portfolio case study
Licensed under MIT. See LICENSE.
Threat intelligence data is sourced from public OSINT feeds: the National Vulnerability Database (NIST), the CISA Known Exploited Vulnerabilities catalog, and the GitHub Security Advisory Database. This project is not affiliated with any of them. CVE® is a registered trademark of MITRE.
