Hybrid Terraform Security Scanner — Deterministic Rules + ML Anomaly Detection
Catch Terraform misconfigurations, hardcoded secrets, and risky infrastructure patterns before they reach production. TerraVault pairs 11 deterministic detection rules with Isolation Forest anomaly detection to surface both known violations and deviations from learned secure baselines.
- Hybrid scoring — 60% rule-based + 40% ML anomaly detection. Deterministic rules for known risks, Isolation Forest for everything else
- Fast enough for CI gating: sub-second per-file scans add negligible pipeline latency
- Operable API — FastAPI with bcrypt API keys, Redis rate limiting, async I/O, Prometheus metrics, correlation IDs
- Measured quality: 72 focused pytest cases, 74% line coverage (1,518 measured statements), Pylint 10.00/10, 0 Flake8 issues, 0 Bandit findings, 0 Safety advisories
- Features
- Quick Start
- Architecture
- CLI Usage
- REST API
- Quality Metrics
- DevSecOps Pipeline
- Docker Deployment
- Monitoring & Observability
- Technology Stack
- Screenshots
- Academic Context
- Limitations & Future Work
- References
- License
- Pattern matching for 11 vulnerability categories: open ports, hardcoded secrets, unencrypted storage, public S3 buckets, IAM misconfigurations, missing CloudWatch logging, missing VPC flow logs, publicly accessible RDS, unrestricted security-group egress, EC2 instances allowing IMDSv1, and EC2 instances with auto-assigned public IPs
- Severity classification: CRITICAL · HIGH · MEDIUM · LOW · INFO
- Actionable remediation suggestions per finding
- Configurable severity overrides for organizational policy alignment
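
As a rough illustration of how one deterministic rule could work, the sketch below flags security groups whose ingress opens port 22 to `0.0.0.0/0`. The nested-dictionary layout is a simplified stand-in for parsed Terraform, and `Finding` and `check_open_ssh` are hypothetical names chosen for this example, not TerraVault's actual API.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    severity: str
    message: str
    resource: str
    remediation: str


def check_open_ssh(resources: dict) -> list[Finding]:
    """Flag security groups whose ingress exposes port 22 to any IPv4 address."""
    findings = []
    for name, body in resources.get("aws_security_group", {}).items():
        for rule in body.get("ingress", []):
            covers_ssh = rule.get("from_port", 0) <= 22 <= rule.get("to_port", 0)
            open_to_world = "0.0.0.0/0" in rule.get("cidr_blocks", [])
            if covers_ssh and open_to_world:
                findings.append(
                    Finding(
                        severity="CRITICAL",
                        message="Open security group - SSH port 22 exposed to internet",
                        resource=name,
                        remediation="Restrict SSH access to specific IP ranges",
                    )
                )
    return findings


# Assumed, simplified shape: {resource_type: {resource_name: attributes}}
parsed = {
    "aws_security_group": {
        "web_sg": {
            "ingress": [
                {"from_port": 22, "to_port": 22, "cidr_blocks": ["0.0.0.0/0"]}
            ]
        },
        "internal_sg": {
            "ingress": [
                {"from_port": 22, "to_port": 22, "cidr_blocks": ["10.0.0.0/8"]}
            ]
        },
    }
}

findings = check_open_ssh(parsed)
for finding in findings:
    print(f"[{finding.severity}] {finding.message} ({finding.resource})")
```

Only `web_sg` is reported here, because `internal_sg` restricts SSH to a private range.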
- Isolation Forest anomaly detection (unsupervised — no labeled data required)
- 8-dimensional structural feature vector extracted directly from the parsed Terraform — independent of the rule findings — so the model reacts to risky infrastructure shapes the fixed rules don't encode: resource count, type diversity, ingress-rule count, public-exposure count, IAM-resource count, encryption coverage, logging-resource count, secret parametrization
- Trained on a synthetic-but-principled corpus of secure-infrastructure profiles where every feature varies and the secure mode is centered, so a fully-encrypted/parametrized config sits inside the learned manifold and insecure deviations score as anomalies
- Model persistence via Joblib with versioning and drift detection
- Confidence scoring based on anomaly distance from learned security baselines
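
To make the 8-dimensional feature vector more concrete, here is a minimal sketch of how such features might be derived from parsed Terraform. It assumes a simplified `{resource_type: {resource_name: attributes}}` dictionary rather than the parser's real output, and the attribute lists (`ENCRYPTION_ATTR`, `LOGGING_TYPES`, `SECRET_KEYS`) and the function name are illustrative assumptions, not TerraVault's actual extraction logic.

```python
# Hypothetical lookup tables; a real extractor would cover more resource types.
ENCRYPTION_ATTR = {"aws_db_instance": "storage_encrypted", "aws_ebs_volume": "encrypted"}
LOGGING_TYPES = {"aws_cloudwatch_log_group", "aws_flow_log"}
SECRET_KEYS = ("password", "secret_key")


def extract_features(resources: dict) -> list[float]:
    """Return the 8 structural features in the order listed above."""
    bodies = [(rtype, body) for rtype, named in resources.items() for body in named.values()]
    ingress = [rule for _, body in bodies for rule in body.get("ingress", [])]

    # Encryption coverage: share of encryptable resources with the flag set to True.
    encryptable = [(rtype, body) for rtype, body in bodies if rtype in ENCRYPTION_ATTR]
    encrypted = sum(body.get(ENCRYPTION_ATTR[rtype]) is True for rtype, body in encryptable)
    encryption_coverage = encrypted / len(encryptable) if encryptable else 1.0

    # Secret parametrization: share of secret-like attributes that reference a variable.
    secrets = [str(v) for _, body in bodies for k, v in body.items() if k in SECRET_KEYS]
    parametrized = sum("var." in value for value in secrets)
    secret_parametrization = parametrized / len(secrets) if secrets else 1.0

    return [
        float(len(bodies)),                                             # resource count
        float(len({rtype for rtype, _ in bodies})),                     # type diversity
        float(len(ingress)),                                            # ingress-rule count
        float(sum("0.0.0.0/0" in r.get("cidr_blocks", []) for r in ingress)),  # public exposure
        float(sum(rtype.startswith("aws_iam_") for rtype, _ in bodies)),       # IAM resources
        encryption_coverage,
        float(sum(rtype in LOGGING_TYPES for rtype, _ in bodies)),      # logging resources
        secret_parametrization,
    ]


example = {
    "aws_security_group": {
        "web_sg": {"ingress": [{"from_port": 22, "to_port": 22, "cidr_blocks": ["0.0.0.0/0"]}]}
    },
    "aws_db_instance": {"main_db": {"storage_encrypted": False, "password": "hunter2"}},
    "aws_ebs_volume": {"data_volume": {"encrypted": True}},
}

features = extract_features(example)
print(features)
```

A vector like this is what an Isolation Forest would score. Treating a config with no encryptable resources or no secrets as fully covered (1.0) is a design choice made for this sketch.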
- FastAPI with OpenAPI/Swagger docs at `/docs`
- Bcrypt-hashed API key authentication
- Redis-backed caching and rate limiting (with in-memory fallback)
- Async file processing with configurable timeouts
- Prometheus metrics at `/metrics`
- Correlation ID tracing for all requests
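
The in-memory rate-limiting fallback mentioned above could look something like the following sliding-window sketch. The class name, its parameters, and the injectable clock are assumptions made for illustration; the 10-requests-per-minute default mirrors the limit documented for `/scan`.

```python
import time
from collections import defaultdict, deque


class InMemoryRateLimiter:
    """Sliding-window limiter: at most `limit` hits per key in `window_seconds`."""

    def __init__(self, limit: int = 10, window_seconds: float = 60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._hits = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


# Demonstration with a fake clock instead of real time.
fake_now = [0.0]
limiter = InMemoryRateLimiter(limit=10, window_seconds=60.0, clock=lambda: fake_now[0])

results = [limiter.allow("api-key-a") for _ in range(11)]
print(results.count(True), "allowed,", results.count(False), "rejected")

fake_now[0] = 60.0  # one full window later
allowed_after_window = limiter.allow("api-key-a")
print("allowed after window:", allowed_after_window)
```

A per-process structure like this does not share state across workers, which is presumably why Redis is the primary backend and the in-memory path only a fallback.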
- GitHub Actions CI/CD with 5-stage pipeline
- SAST (Bandit), dependency scanning (Safety), secret detection (GitLeaks)
- Docker image security scan (Trivy)
- Pre-commit hooks for local development
- SBOM generation (CycloneDX)
- Python 3.10+
- Git
# Clone the repository
git clone https://github.com/oguarni/terravault.git
cd terravault
# Install everything (creates venv, installs deps)
make install

# Scan all three test configurations
make demo
# Or scan a specific file
python -m terravault.cli test_files/vulnerable.tf
python -m terravault.cli test_files/secure.tf
python -m terravault.cli test_files/mixed.tf

make test # All tests
make coverage # With coverage report
make lint # Code quality (Pylint + Flake8)
make security-scan # Bandit SAST + Safety dependency check

For full API setup with Docker, database, and monitoring, see the Quick Start Guide.
TerraVault follows Clean Architecture with strict layer separation:
terravault/
├── domain/ # Business rules, severity levels, vulnerability models
├── application/ # Use cases — IntelligentSecurityScanner orchestrator
├── infrastructure/ # Adapters — HCL parser, ML model, database, cache
├── config/ # Settings (Pydantic), structured logging
├── cli.py # Command-line interface (text/json/sarif output)
├── api.py # FastAPI REST server
└── metrics.py # Prometheus instrumentation
graph TD
A[Terraform .tf File] --> B[HCL2 Parser]
B --> C[Feature Extraction Engine]
C --> D[Rule-based Detection]
C --> E[ML Feature Vectorization]
D --> F[Pattern Matching<br>11 vulnerability categories]
E --> G[Isolation Forest<br>Anomaly Detection]
F --> H[Risk Score Aggregator<br>0.6 x Rules + 0.4 x ML]
G --> H
H --> I[Scan Report<br>Score · Vulnerabilities · Confidence]
style C fill:#e1f5ff,stroke:#0288d1,stroke-width:2px,color:#01579b
style H fill:#fff3e0,stroke:#f57c00,stroke-width:2px,color:#e65100
style I fill:#e8f5e9,stroke:#388e3c,stroke-width:2px,color:#1b5e20
| Weight | Component | Method |
|---|---|---|
| 60% | Rule-based | Deterministic pattern matching — CRITICAL (30pts), HIGH (20pts), MEDIUM (10pts), LOW (5pts), INFO (2pts) |
| 40% | ML Anomaly | Isolation Forest deviation from learned security baseline |
Score ranges: 0-30 Secure · 31-60 Review recommended · 61-100 Critical action required
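
The weighting can be worked through in a few lines. This is a sketch under stated assumptions: the severity points and the 0.6/0.4 weights come from the table above, while capping the rule score at 100 and the handling of fractional scores at band boundaries are guesses (the cap is at least consistent with the sample CLI output, where five findings yield a rule score of 100). Function names are hypothetical.

```python
SEVERITY_POINTS = {"CRITICAL": 30, "HIGH": 20, "MEDIUM": 10, "LOW": 5, "INFO": 2}


def rule_score(severities: list[str]) -> int:
    """Sum severity points, capped at 100 (the cap is an assumption)."""
    return min(100, sum(SEVERITY_POINTS[s] for s in severities))


def final_score(rule: float, ml: float) -> float:
    """Blend the two signals: 60% rules, 40% ML anomaly score."""
    return 0.6 * rule + 0.4 * ml


def band(score: float) -> str:
    """Map a score to the documented ranges; boundary rounding is assumed."""
    if score <= 30:
        return "Secure"
    if score <= 60:
        return "Review recommended"
    return "Critical action required"


# Two CRITICAL and three HIGH findings, with an ML anomaly score of 54.7.
sample_findings = ["CRITICAL", "CRITICAL", "HIGH", "HIGH", "HIGH"]
vulnerable = final_score(rule_score(sample_findings), 54.7)
print(round(vulnerable, 2), band(vulnerable))

# No rule findings, ML anomaly score of 46.0.
clean = final_score(rule_score([]), 46.0)
print(round(clean, 2), band(clean))
```

The first case works out to roughly 81.9 and the second to roughly 18.4, in line with the 81/100 and 18/100 shown in the sample CLI output.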
# Scan a Terraform file
python -m terravault.cli <path-to-file.tf>
# Scan via Makefile
make scan FILE=test_files/vulnerable.tf
# JSON output for CI integration
python -m terravault.cli --output-format json --threshold 50 file1.tf file2.tf
# SARIF output for GitHub Code Scanning
python -m terravault.cli --output-format sarif file.tf

TerraVault - Intelligent Terraform Security Scanner
Using hybrid approach: Rules (60%) + ML Anomaly Detection (40%)
============================================================
TERRAFORM SECURITY SCAN RESULTS
============================================================
File: test_files/vulnerable.tf
HIGH RISK
Final Risk Score: 81/100
Rule-based Score: 100/100
ML Anomaly Score: 54.7/100
Confidence: LOW
Detected Vulnerabilities:
[CRITICAL] Open security group - SSH port 22 exposed to internet
Resource: web_sg
Fix: Restrict SSH access to specific IP ranges
[CRITICAL] Hardcoded password detected
Resource: Database/Instance
Fix: Use variables or secrets manager for sensitive data
[HIGH] Unencrypted RDS instance
Resource: main_db
Fix: Enable storage_encrypted = true
[HIGH] Unencrypted EBS volume
Resource: data_volume
Fix: Enable encrypted = true
[HIGH] S3 bucket with public access enabled
Resource: public_bucket
Fix: Enable all public access blocks
LOW RISK
Final Risk Score: 18/100
Rule-based Score: 0/100
ML Anomaly Score: 46.0/100
Confidence: LOW
No security issues detected!
All resources properly configured
Encryption enabled where required
Network access properly restricted
# Local development
make api
# Production (Docker)
docker-compose up -d

| Method | Endpoint | Auth | Description |
|---|---|---|---|
| GET | `/health` | No | Health check with DB and rate limiter status |
| POST | `/scan` | API Key | Scan a Terraform file (rate limited: 10/min) |
| GET | `/metrics` | No | Prometheus metrics |
| GET | `/docs` | No | OpenAPI/Swagger UI |
curl -X POST \
-H "X-API-Key: YOUR_API_KEY" \
-F "file=@terraform.tf" \
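http://localhost:8000/scan

import requests

# Open the file in a context manager so the handle is closed after the upload
with open("terraform.tf", "rb") as tf_file:
    response = requests.post(
        "http://localhost:8000/scan",
        headers={"X-API-Key": "YOUR_API_KEY"},
        files={"file": tf_file},
        timeout=30,  # avoid hanging indefinitely if the API is unreachable
    )
print(response.json())

{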
"file": "vulnerable.tf",
"score": 85,
"rule_based_score": 90,
"ml_score": 75.5,
"confidence": "HIGH",
"vulnerabilities": [
{
"severity": "CRITICAL",
"points": 20,
"message": "Hardcoded AWS credentials detected",
"resource": "aws_instance.web",
"remediation": "Use AWS IAM roles or environment variables"
}
],
"summary": { "critical": 1, "high": 2, "medium": 0, "low": 0 },
"performance": { "scan_time_seconds": 0.234, "file_size_kb": 1.5, "from_cache": false }
}

Generate API keys with `python scripts/generate_api_key.py`. See QUICKSTART.md for full API setup.
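
A CI job might turn a scan response like the one above into a pass/fail decision. The sketch below relies only on fields that appear in the sample response (`score`, `vulnerabilities`, `severity`, `message`, `resource`). The `gate` function, its default threshold of 50, and the "fail at or above the threshold" semantics are assumptions for illustration.

```python
import json


def gate(report: dict, threshold: int = 50, block_on=("CRITICAL",)) -> tuple[bool, list[str]]:
    """Return (passed, reasons). Fails on a high score or any blocking severity."""
    reasons = []
    if report["score"] >= threshold:
        reasons.append(f"score {report['score']} is at or above threshold {threshold}")
    for vuln in report.get("vulnerabilities", []):
        if vuln["severity"] in block_on:
            reasons.append(f"{vuln['severity']}: {vuln['message']} ({vuln['resource']})")
    return (not reasons, reasons)


# Trimmed copy of the sample response shown above.
sample = json.loads("""
{
  "file": "vulnerable.tf",
  "score": 85,
  "vulnerabilities": [
    {
      "severity": "CRITICAL",
      "message": "Hardcoded AWS credentials detected",
      "resource": "aws_instance.web"
    }
  ]
}
""")

passed, reasons = gate(sample)
print("PASS" if passed else "FAIL")
for reason in reasons:
    print(" -", reason)
```

In a pipeline, the script would typically end with `sys.exit(0 if passed else 1)` so that a failing gate stops the deployment.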
All metrics from the latest full local run — April 16, 2026.
| Category | Metric | Result |
|---|---|---|
| Testing | Test suite | 72 tests — 72 passed, 0 skipped |
| Testing | Code coverage | 74.11% across 24 measured modules (1,518 statements) |
| Code Quality | Pylint score | 10.00 / 10 |
| Code Quality | Flake8 | 0 issues |
| Code Quality | Codebase size | 1,518 measured statements (3,352 non-blank lines) |
| Security | SAST (Bandit) | 0 issues — 0 High, 0 Medium, 0 Low |
| Security | Dependencies (Safety) | 0 vulnerabilities |
GitHub Actions pipeline with 5 stages:
graph LR
A[Security Scan] --> B[Unit Tests]
B --> C[Integration Scan]
B --> D[Docker Build + Trivy]
C --> E[Deploy Staging]
D --> E
style A fill:#ffebee,stroke:#c62828,color:#b71c1c
style B fill:#e3f2fd,stroke:#1565c0,color:#0d47a1
style C fill:#e8f5e9,stroke:#2e7d32,color:#1b5e20
style D fill:#fff3e0,stroke:#ef6c00,color:#e65100
style E fill:#f3e5f5,stroke:#7b1fa2,color:#4a148c
| Stage | Tool | Purpose |
|---|---|---|
| SAST | Bandit | Static code analysis for Python security issues |
| Dependencies | Safety | Known vulnerability check for all pip packages |
| Secrets | GitLeaks | Detect hardcoded secrets and credentials |
| Container | Trivy | Docker image vulnerability scanning |
| Coverage | Codecov | Test coverage tracking and reporting |
make security-scan # Run all security checks
make security-deps # Dependency vulnerabilities only
make security-sast # SAST only
make setup-hooks # Install pre-commit hooks

# Build and scan
docker build -t terravault:latest .
docker run --rm -v /path/to/terraform:/scan:ro terravault:latest /scan/main.tf

docker-compose up -d

| Service | Port | Purpose |
|---|---|---|
| terravault-api | 8000 | FastAPI application |
| PostgreSQL | 5432 | Persistent scan storage |
| Redis | 6379 | Caching and rate limiting |
| Prometheus | 9090 | Metrics collection |
| Grafana | 3000 | Dashboards and visualization |
The Docker image runs as a non-root user. Running the container with `--read-only` and `--security-opt=no-new-privileges` is recommended.
- Prometheus scrapes `/metrics` every 10s: scan rates, cache hits, latencies, error rates
- Grafana dashboard (TerraVault Overview) with pre-configured panels:
  - Scan rate and cache hit ratio
  - Vulnerability distribution by severity and category
  - P95/P99 scan duration
  - API request latency and error rates
- Structured JSON logging with correlation IDs for request tracing
- Health check endpoint at `/health` with database connectivity status
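
Structured JSON logging with correlation IDs can be approximated with the standard library alone. The following is a minimal sketch, not TerraVault's logging configuration: the field names, the `terravault.demo` logger name, and the use of a `ContextVar` to carry the ID are all assumptions.

```python
import contextvars
import io
import json
import logging
import uuid

# Holds the ID for the request currently being handled.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object that includes the correlation ID."""

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps(
            {
                "level": record.levelname,
                "logger": record.name,
                "message": record.getMessage(),
                "correlation_id": correlation_id.get(),
            }
        )


# Log to an in-memory stream so the output can be inspected below.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

logger = logging.getLogger("terravault.demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

# An API middleware would set this once per incoming request.
correlation_id.set(uuid.uuid4().hex)
logger.info("scan started")
logger.info("scan finished")

lines = [json.loads(line) for line in stream.getvalue().splitlines()]
print(stream.getvalue())
```

Both log lines carry the same `correlation_id`, which is what lets an operator follow a single request through the logs.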
| Component | Technology | Purpose |
|---|---|---|
| Language | Python 3.10+ | ML ecosystem, clean syntax |
| ML Framework | scikit-learn (Isolation Forest) | Unsupervised anomaly detection |
| Parser | python-hcl2 | Native Terraform HCL2 parsing |
| API Framework | FastAPI + Uvicorn | Async REST API with OpenAPI docs |
| Database | PostgreSQL + SQLAlchemy (async) | Scan history persistence |
| Cache | Redis | LRU caching, rate limiting |
| Auth | bcrypt | API key hashing |
| Monitoring | Prometheus + Grafana | Metrics and dashboards |
| Containers | Docker + Docker Compose | Multi-service deployment |
| CI/CD | GitHub Actions | DevSecOps automation |
| Numerical | NumPy | Feature vector operations |
| Model Persistence | Joblib | Serialized scikit-learn models |
| Field | Details |
|---|---|
| Course | Capstone Project I and II |
| Institution | Federal University of Technology - Parana (UTFPR) |
| Program | B.S. in Software Engineering, 8th Semester |
| Type | Technical Report |
Isolation Forest was selected after evaluating alternatives against four practical criteria: suitability for unlabeled data (labeled Terraform misconfiguration datasets are scarce), efficiency on structured configuration inputs, performance with limited training samples, and output interpretability.
| Criterion | Isolation Forest | Neural Networks | Genetic Algorithms | Decision Trees |
|---|---|---|---|---|
| Unsupervised (no labels) | Strong | Weak | N/A | Weak |
| Efficient on structured data | Strong | Overkill | Misaligned | Moderate |
| Small-sample performance | Strong | Weak | Moderate | Moderate |
| Explainable output | Strong | Weak | Moderate | Strong |
- Hybrid detection — Deterministic rules catch known misconfigurations with zero false negatives against their patterns; Isolation Forest adds coverage for deviations the ruleset has not seen. The signals are complementary, not redundant.
- Evolving baseline — The model refines its security baseline as more configurations are analyzed. Drift detection flags distributional shifts so operators know when a retrain is warranted.
- Explainable scoring — Every finding ships with its feature vector, rule attribution, and confidence level. Results are auditable, not black-box.
- CI-compatible performance — Sub-second per-file latency makes security gating a viable step in deployment pipelines rather than an offline batch job.
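
The README does not say how drift detection is implemented. One simple possibility, shown here purely as an assumption, is to compare the mean of each feature in a recent batch against the training baseline and flag features whose mean has moved by more than a few standard errors. The function name and the threshold of 3.0 are hypothetical.

```python
from statistics import mean, stdev


def drifted_features(baseline: list[list[float]], recent: list[list[float]],
                     z_threshold: float = 3.0) -> list[int]:
    """Return indices of features whose recent mean sits far from the baseline mean."""
    flagged = []
    for i in range(len(baseline[0])):
        base_col = [row[i] for row in baseline]
        recent_col = [row[i] for row in recent]
        mu, sigma = mean(base_col), stdev(base_col)
        # Standard error of the recent-batch mean under the baseline distribution.
        std_err = sigma / len(recent_col) ** 0.5
        if std_err == 0:
            if mean(recent_col) != mu:
                flagged.append(i)
            continue
        if abs(mean(recent_col) - mu) / std_err > z_threshold:
            flagged.append(i)
    return flagged


# Two illustrative features per config: [resource count, public-exposure count].
baseline = [[10, 0], [12, 0], [11, 1], [9, 0], [10, 0], [11, 0]]
recent = [[10, 3], [11, 4], [12, 3], [9, 5]]

flagged = drifted_features(baseline, recent)
print("drifted feature indices:", flagged)
```

Here the resource count is stable while the public-exposure count has shifted sharply, so only the second feature is flagged. A production check would more likely use a proper two-sample test over larger windows.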
- Baseline training data is synthetic; real-world distributions may differ
- No support for Terraform modules or remote state
- Vulnerability messages and remediation guidance in English only
- AWS coverage only; Azure and GCP provider patterns are not yet encoded
- Multi-cloud coverage (Azure, GCP) with provider-specific rule packs
- Terraform module and remote-state analysis
- Custom policy definition language for organizational rules
- Deeper ML models evaluated against the current Isolation Forest baseline
- Integration with cloud provider native security APIs (AWS Config, etc.)
- Gartner (2024). Cloud Security Failures Report
- IBM Security (2024). Cost of a Data Breach Report
- HashiCorp. Terraform Security Best Practices
- Liu, F. T., Ting, K. M., & Zhou, Z. H. (2008). Isolation Forest. In Proceedings of the Eighth IEEE International Conference on Data Mining (ICDM '08)
Copyright (C) 2025-2026 Gabriel Felipe Guarnieri. All rights reserved.
This project is dual-licensed under the AGPL-3.0 + Commercial License model:
- Open-source use: Licensed under the GNU Affero General Public License v3.0 (AGPL-3.0).
- Commercial use: A proprietary commercial license is available for use cases incompatible with the AGPL-3.0. See LICENSE-COMMERCIAL.md for details.
Retroactive clause: This license applies retroactively to all past commits and versions of this repository, superseding any previously stated license.
For commercial licensing inquiries, contact the author via LinkedIn.
Developed by Gabriel Felipe Guarnieri — UTFPR Software Engineering



