threat-intel-api

A sector-aware OSINT vulnerability intelligence API.
Aggregates NVD, CISA KEV and GitHub Advisories, scores each CVE against per-sector profiles, and exposes the result as JSON, RSS and a Swagger UI.

Live API · Swagger UI · Finance dashboard (sample)


See it in 30 seconds

Three curl calls against the live API: top finance threats, RSS subscription, audit-grade score breakdown. Rendered with VHS from docs/assets/hero-demo.tape.

The three calls below are the same ones the GIF runs.

Top 5 finance-relevant threats, last 24h

curl -s 'https://threat-intel-api-production.up.railway.app/api/v1/sectors/finance/dashboard' \
  | jq '.top_24h[:5] | .[] | {cve: .external_id, score, cvss: .cvss_score, title: .title[0:80]}'
{
  "cve": "CVE-2026-7579",
  "score": 35.0,
  "cvss": 7.3,
  "title": "Hard-coded credentials in AstrBot dashboard auth (CWE-798)"
}

Subscribe a SIEM to the finance RSS feed

curl 'https://threat-intel-api-production.up.railway.app/api/v1/sectors/finance/feed.rss?min_score=70'
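
For a consumer that cannot ingest RSS natively, the feed can be flattened into plain records first. The sketch below uses only the standard library and relies solely on the standard RSS 2.0 item elements (title, link, pubDate). The sample payload is hand-written for illustration and is not real output from the API, and `parse_rss_items` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Stand-in payload: a hand-written RSS 2.0 document, NOT real API output.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example sector feed</title>
    <item>
      <title>CVE-0000-0001 example entry</title>
      <link>https://example.invalid/threats/1</link>
      <pubDate>Mon, 01 Jan 2024 00:00:00 GMT</pubDate>
    </item>
    <item>
      <title>CVE-0000-0002 example entry</title>
      <link>https://example.invalid/threats/2</link>
    </item>
  </channel>
</rss>
"""


def parse_rss_items(xml_text: str) -> list[dict[str, str]]:
    """Return one dict per <item>, keeping only standard RSS 2.0 child elements."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        items.append(
            {
                "title": item.findtext("title", default=""),
                "link": item.findtext("link", default=""),
                # pubDate is optional in RSS 2.0, so fall back to an empty string.
                "pub_date": item.findtext("pubDate", default=""),
            }
        )
    return items


entries = parse_rss_items(SAMPLE_FEED)
```

In practice the XML text would come from the curl call above (or an equivalent HTTP client) rather than from a literal string.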

Inspect the score breakdown of a single CVE

curl -s 'https://threat-intel-api-production.up.railway.app/api/v1/sectors/finance/dashboard' \
  | jq '.top_24h[0].score_breakdown'
{
  "cwe_match":        { "hit": true,  "matched": ["CWE-798"], "points": 20 },
  "cvss_threshold":   { "hit": true,  "threshold": 7.0, "cvss_score": 7.3, "points": 15 },
  "kev":              { "hit": false, "points": 0 },
  "technology_match": { "hit": false, "matched": [], "points": 0 }
}

Excerpt; the full breakdown also exposes keyword_match, priority_boost, excluded, actively_exploited, ransomware, multi_source, package_match, plus raw_total and final_score. Every score is auditable: you can always see why a CVE landed where it did.
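
As a quick audit, a consumer can re-add the per-criterion points and compare the sum with the published score. The sketch below does this for the excerpt above. It assumes that the criteria omitted from the excerpt contribute 0 points for this CVE and that the total is a plain sum of points; neither is confirmed here, and docs/SCORING.md remains the reference. `total_points` is a hypothetical helper name.

```python
# Values copied from the score_breakdown excerpt above.
breakdown = {
    "cwe_match": {"hit": True, "matched": ["CWE-798"], "points": 20},
    "cvss_threshold": {"hit": True, "threshold": 7.0, "cvss_score": 7.3, "points": 15},
    "kev": {"hit": False, "points": 0},
    "technology_match": {"hit": False, "matched": [], "points": 0},
}


def total_points(criteria: dict) -> float:
    """Sum the 'points' of every criterion entry; skip non-criterion values."""
    return float(
        sum(
            entry["points"]
            for entry in criteria.values()
            if isinstance(entry, dict) and "points" in entry
        )
    )


recomputed = total_points(breakdown)
```

For this excerpt the sum is 20 + 15 = 35, which agrees with the score of 35.0 shown for the same CVE in the first call.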


Why it exists

A bank's security team and a hospital's IT team do not need the same threat feed, but they usually get the same one. NVD publishes around 150 CVEs a day. CISA KEV adds the subset attackers are actively using. GitHub Advisories covers the open-source supply chain. None of those feeds is wrong; none is targeted either.

This API ingests the three sources, deduplicates across them, and scores each CVE against a YAML-defined sector profile: keywords, technology stack, weighted CWEs, exclusions, CVSS threshold. A finance team gets a feed weighted on payment rails and authentication. An industrial team gets one weighted on PLC vendors and ICS protocols.
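
To make the deduplication step concrete, here is a minimal sketch of merging records that describe the same CVE, resolving each field by a per-field source ranking. It is an illustration only: the rankings in `FIELD_PRIORITY`, the `source` and `known_exploited` field names, and `merge_records` itself are assumptions, not the project's actual IngestService logic (see docs/COLLECTORS.md for that).

```python
# Hypothetical per-field source ranking (first source with a value wins).
FIELD_PRIORITY = {
    "cvss_score": ["nvd", "ghsa", "kev"],
    "title": ["ghsa", "nvd", "kev"],
}
DEFAULT_PRIORITY = ["nvd", "kev", "ghsa"]


def merge_records(records: list[dict]) -> dict[str, dict]:
    """Group records by CVE ID, then pick each field from the best-ranked source."""
    grouped: dict[str, list[dict]] = {}
    for rec in records:
        grouped.setdefault(rec["external_id"], []).append(rec)

    merged: dict[str, dict] = {}
    for cve_id, recs in grouped.items():
        by_source = {r["source"]: r for r in recs}
        fields = {k for r in recs for k in r if k not in ("source", "external_id")}
        out = {"external_id": cve_id, "sources": sorted(by_source)}
        for field in sorted(fields):
            for source in FIELD_PRIORITY.get(field, DEFAULT_PRIORITY):
                value = by_source.get(source, {}).get(field)
                if value is not None:
                    out[field] = value
                    break
        merged[cve_id] = out
    return merged


# Made-up records for illustration.
records = [
    {"source": "nvd", "external_id": "CVE-0000-0001", "cvss_score": 7.3, "title": "Title from NVD"},
    {"source": "kev", "external_id": "CVE-0000-0001", "known_exploited": True},
    {"source": "ghsa", "external_id": "CVE-0000-0001", "cvss_score": 7.0, "title": "Title from GHSA"},
    {"source": "nvd", "external_id": "CVE-0000-0002", "cvss_score": 5.0, "title": "Only in NVD"},
]
merged = merge_records(records)
```

Keeping the ranking per field rather than per source lets one feed win on severity while another wins on descriptive text.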

A SIEM, a SOAR, or a human analyst can consume the result without tuning the API itself, because all the tuning sits in version-controlled YAML.


Features

| Feature | Description |
| --- | --- |
| Multi-source ingestion | NVD, CISA KEV and GitHub Advisories with cross-feed deduplication |
| Sector-aware scoring | 6 public profiles (finance, healthcare, ICS, gov, SaaS, e-commerce) |
| Async pipeline | httpx + APScheduler, field-level priority on conflicting sources |
| SIEM-ready feeds | JSON, RSS 2.0, paginated and filterable |
| Hardened by default | OWASP API Top 10, ASVS L2, CIS Docker; details in SECURITY.md |
| Hot-reload profiles | drop a YAML, POST /admin/reload-profiles, picked up without restart |
| Auditable scoring | every score includes a per-criterion breakdown |
| Per-sector dashboards | top 24h, top 7d, aggregate stats per profile |

Architecture

flowchart LR
    subgraph Sources["OSINT sources"]
        direction TB
        NVD["NVD<br/>REST / JSON"]
        KEV["CISA KEV<br/>REST / JSON"]
        GHSA["GitHub Advisories<br/>GraphQL"]
    end

    subgraph Pipeline["Ingestion &amp; scoring"]
        direction TB
        Ingest["IngestService<br/>cross-source dedup"]
        YAML["profiles/*.yaml<br/>hot-reloadable"]
        Loader["SectorProfileLoader"]
        Scoring["SectorScoringService"]
    end

    subgraph Storage["PostgreSQL"]
        direction TB
        DB[("threat,<br/>threat_source,<br/>threat_indicator")]
        Scores[("threat_sector_score")]
    end

    API["FastAPI v1"]
    Consumers(["SIEM / SOAR / analyst"])

    NVD -->|"httpx, 60 min"| Ingest
    KEV -->|"httpx, 6 h"| Ingest
    GHSA -->|"httpx, 2 h"| Ingest
    Ingest -->|"SQLAlchemy 2.0"| DB
    YAML --> Loader
    Loader --> Scoring
    DB --> Scoring
    Scoring --> Scores
    DB --> API
    Scores --> API
    API --> Consumers

Detailed component docs: docs/COLLECTORS.md, docs/SCORING.md, docs/INDICATORS.md.


Stack

| Layer | Tools |
| --- | --- |
| Runtime | Python 3.12, FastAPI, async/await, Uvicorn |
| Data | PostgreSQL 16, SQLAlchemy 2.0, Alembic |
| Ingestion | httpx, respx (tests), APScheduler |
| Validation | Pydantic v2, RFC 7807 problem+json |
| DevSecOps | Bandit, Semgrep, Ruff, mypy strict, pip-audit, Trivy, Hadolint, Gitleaks, CycloneDX SBOM |
| Observability | structlog (JSON logs, PII scrubbing), Sentry, /health |
| Container | Multi-stage Alpine, distroless-style runtime, non-root uid 1001, ~195 MB |
| CI/CD | GitHub Actions (7-job security gate), branch-protected main, Railway deploy on merge |

Quick start

git clone https://github.com/Setounkpe7/threat-intel-api.git
cd threat-intel-api
cp .env.example .env   # optional — fill in ADMIN_API_KEY / GITHUB_TOKEN to unlock admin endpoints and GHSA
docker compose up --build

Then curl localhost:8000/health and open http://localhost:8000/docs.

After the stack is up, optionally seed demo data so the sector dashboards have something to show:

docker compose exec app python scripts/demo_seed.py

Full setup (local Python venv, SQLite mode, env reference, troubleshooting) lives in docs/INSTALLATION.md.

API guide with worked examples: docs/API_USAGE.md.


Sector profiles

Each profile is one YAML file. Drop it in profiles/public/ and call the reload endpoint (POST /admin/reload-profiles). The new dashboard, threats endpoint and RSS feed then appear without restarting the API or running a migration.

# profiles/public/finance.yaml (excerpt)
id: finance
name: Finance & Banking
sector: financial-services
keywords: [swift, iso20022, pci-dss, banking, payment-rail]
technologies: [oracle-database, ibm-mq, kafka, kubernetes]
cwe_priorities:
  CWE-798: 20   # hard-coded credentials
  CWE-89:  18   # SQL injection
  CWE-287: 16   # improper authentication
cvss_threshold: 7.0
priority_boost: [kev, actively-exploited]

Schema and worked examples: profiles/README.md. Scoring algorithm: docs/SCORING.md.
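
To illustrate how a profile like this could drive scoring, the sketch below applies two of its criteria (weighted CWEs and the CVSS threshold) to a CVE record. It is a simplified illustration, not the algorithm in docs/SCORING.md: taking the highest matched CWE weight, the 15-point CVSS award (borrowed from the score breakdown shown earlier in this README), the `cwes` input field, and the `score_cve` name are all assumptions.

```python
# Values copied from the finance.yaml excerpt above, expressed as a plain dict.
PROFILE = {
    "cwe_priorities": {"CWE-798": 20, "CWE-89": 18, "CWE-287": 16},
    "cvss_threshold": 7.0,
}

# Assumption: points for meeting the CVSS threshold, taken from the breakdown example.
CVSS_POINTS = 15


def score_cve(cve: dict, profile: dict) -> dict:
    """Return a per-criterion breakdown plus a raw total for one CVE."""
    weights = profile["cwe_priorities"]
    matched = [cwe for cwe in cve.get("cwes", []) if cwe in weights]
    # Assumption: when several CWEs match, only the highest weight counts.
    cwe_points = max((weights[cwe] for cwe in matched), default=0)

    cvss = cve.get("cvss_score") or 0.0
    cvss_hit = cvss >= profile["cvss_threshold"]

    breakdown = {
        "cwe_match": {"hit": bool(matched), "matched": matched, "points": cwe_points},
        "cvss_threshold": {
            "hit": cvss_hit,
            "threshold": profile["cvss_threshold"],
            "cvss_score": cvss,
            "points": CVSS_POINTS if cvss_hit else 0,
        },
    }
    breakdown["raw_total"] = sum(entry["points"] for entry in breakdown.values())
    return breakdown


result = score_cve({"cwes": ["CWE-798"], "cvss_score": 7.3}, PROFILE)
```

Because every weight lives in the profile, changing a sector's priorities means editing YAML rather than code.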


Security & DevSecOps

The CI security gate runs on every PR and every job below is blocking. As of this writing the gate spans nine sub-jobs plus an umbrella status check on main and dev.

Static analysis (SAST)

  • Bandit — Python AST audit (CWE coverage tuned for web)
  • Semgrep — pattern rules including OWASP Top 10
  • Ruff with security ruleset
  • mypy — strict mode on src/

Secret scanning

  • Gitleaks — scans the full git history on every PR; fails the gate on any committed credential, key, or token

Dependency security (SCA)

  • pip-audit — runtime CVE scan against requirements.lock
  • Dependabot — weekly updates, grouped, auto-merged on green
  • CycloneDX SBOM — generated and attached to release artifacts

Container security

  • Multi-stage build, runtime image without pip / setuptools / wheel
  • Non-root user (uid 1001) by default
  • Hadolint lints the Dockerfile in CI
  • Trivy image scan; CRITICAL+HIGH count must be 0
  • Image is published to ghcr.io/setounkpe7/threat-intel-api on merge to main, signed with Sigstore cosign (keyless, OIDC-bound to this repo's GitHub Actions identity), and shipped with a CycloneDX SBOM and SLSA L2 build-provenance attestation

Verifying a published image

Any consumer can cryptographically verify the image, the SBOM, and the build provenance:

IMAGE=ghcr.io/setounkpe7/threat-intel-api:sha-<short-sha>
IDENTITY_REGEX='^https://github\.com/Setounkpe7/threat-intel-api/\.github/workflows/release\.yml@refs/heads/main$'
OIDC_ISSUER=https://token.actions.githubusercontent.com

cosign verify "${IMAGE}" \
  --certificate-identity-regexp "${IDENTITY_REGEX}" \
  --certificate-oidc-issuer "${OIDC_ISSUER}"

cosign verify-attestation --type cyclonedx "${IMAGE}" \
  --certificate-identity-regexp "${IDENTITY_REGEX}" \
  --certificate-oidc-issuer "${OIDC_ISSUER}"

gh attestation verify oci://${IMAGE} --repo Setounkpe7/threat-intel-api

All three must exit 0. See docs/RELEASING.md for the rollback procedure and the one-time GHCR setup.

Runtime security

  • Security headers middleware (CSP, HSTS, frame-deny, content-type-options)
  • Per-IP rate limiter, configurable, with Retry-After
  • Structured JSON logs with PII scrubbing
  • Sentry for error capture, environment-tagged
  • Errors served as RFC 7807 application/problem+json (no stack traces leaked)
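
As an illustration of the per-IP limiter with Retry-After, here is a minimal fixed-window sketch in plain Python. It is not the project's middleware: the class name, the fixed-window strategy, and the injectable clock are assumptions chosen to keep the example small and testable.

```python
import math
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per IP in each `window_seconds` window."""

    def __init__(self, limit: int, window_seconds: float, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the behaviour can be tested without sleeping
        self._windows: dict[str, tuple[float, int]] = {}  # ip -> (window_start, count)

    def check(self, ip: str) -> tuple[bool, int]:
        """Return (allowed, retry_after_seconds); retry_after is 0 when allowed."""
        now = self.clock()
        start, count = self._windows.get(ip, (now, 0))
        if now - start >= self.window:
            # The previous window has expired, so start a fresh one.
            start, count = now, 0
        if count >= self.limit:
            # Seconds until the current window ends, rounded up for the header.
            retry_after = max(math.ceil(start + self.window - now), 1)
            self._windows[ip] = (start, count)
            return False, retry_after
        self._windows[ip] = (start, count + 1)
        return True, 0
```

A middleware built on this would answer a rejected request with HTTP 429 and set the Retry-After header to the returned number of seconds.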

Compliance & standards

  • OWASP API Security Top 10 — controls mapped per item in docs/SECURITY.md
  • OWASP ASVS Level 2 — gap analysis tracked in repo
  • NIST SSDF — practices PO/PS/PW/RV mapped to repo features
  • CIS Docker Benchmark — Dockerfile reviewed against the relevant items
  • Coordinated disclosure policy — see SECURITY.md

Project metrics

Snapshot from the live deployment (2026-05-01):

| Metric | Value |
| --- | --- |
| Total CVEs in database | 27,695 |
| New CVEs ingested last 24h | 273 |
| Sources integrated | 3 (NVD + CISA KEV active in production, GHSA in stabilization) |
| Sector profiles available | 6 public |
| Endpoints | 18 (13 public + 5 admin) |
| Python LOC (src/) | ~4,550 |
| Tests | 291 (unit + integration + security) |
| Test coverage gate | ≥ 80% (currently ~88%) |
| Container image | 199 MB Alpine, non-root |
| /health latency (remote client) | p50 ≈ 310 ms · p95 ≈ 410 ms (100 samples, includes DB round-trip) |

Roadmap

  • M1 — NVD ingestion, schema, core API
  • M2 — Sector-aware scoring, dashboards, RSS, hot-reload, admin endpoints, rate limiting
  • M3 — Production deploy on Railway with security gate
  • M3a — Multi-source ingestion (CISA KEV + GHSA), cross-source dedup, IOC lookups
  • M3b — GHSA collector stabilization, additional IOC source types
  • M4 — Webhook alerting on critical threats
  • M5 — STIX 2.1 / TAXII export
  • M6 — NLP-based indicator extraction (spaCy)
  • M7 — Public web dashboard frontend

Contributing

Issues and PRs are welcome. Conventions, dev setup, and commit format: CONTRIBUTING.md. Behavior: CODE_OF_CONDUCT.md. Release log: CHANGELOG.md.

make lint typecheck test    # the same gates CI runs on a PR
make security-audit         # bandit + pip-audit + semgrep, locally

Author

Built by Michel-Ange Doubogan (cybersecurity, Python). LinkedIn · Portfolio case study


License & acknowledgments

Licensed under MIT. See LICENSE.

Threat intelligence data is sourced from public OSINT feeds: the National Vulnerability Database (NIST), the CISA Known Exploited Vulnerabilities catalog, and the GitHub Security Advisory Database. This project is not affiliated with any of them. CVE® is a registered trademark of MITRE.
