surajgojanur/RootGuardians

🛡 RootGuardians

Continuous Security Control Assurance Platform

Baseline-relative. Deterministic. Audit-ready.

Python 3.10+ · FastAPI · React 18 · SQLite · License: MIT

Société Générale Hackathon 2026 · Problem Statement 02: Security Control Drift & Misconfiguration Detection


The Problem

40% of breaches come from misconfigured controls, not missing ones. The control was approved, deployed correctly — and then it silently changed. A "temporary" firewall rule stays open for two years. Audit logging is disabled for maintenance and never re-enabled. Encryption is quietly downgraded. Controls change daily. Nobody notices. RootGuardians does.

What RootGuardians Does

  • Detects drift, not gaps. It compares every live control against the baseline you approved and flags any deviation — with full evidence — in seconds.
  • Tells you why it matters. Every finding carries a deterministic risk score, a compliance-impact map across 7 frameworks, plain-English business impact, and copy-ready remediation.
  • Proves it works. It self-scores 100% detection / 0% false positives on the PS2 labeled dataset, logs every scan to a database, and exports an 11-page auditor-ready PDF in one click.

Demo Video

▶️ Watch the full walkthrough (docs/media/demo-video.mp4) — a complete tour: clean baseline, live drift injection, instant detection, attack path, and the one-click audit report.

If the video doesn't play inline on GitHub, click the link to download it. (Not committed yet? See docs/MEDIA_GUIDE.md for how to add demo-video.mp4 or swap in a hosted link.)

Live Demo

Clean baseline → emergency drift injected → score drops live → attack path detected → audit report

  • Healthy baseline: posture 100.0, every control compliant.
  • Emergency drift: posture 53.5, attack path lit, severity cards red.
  • Finding detail: confidence, compliance chips, remediation playbook.
  • Audit report: one-click, auditor-ready, print-to-PDF.

Key Numbers

  • 16 security controls monitored (SSH, firewall, audit, MySQL, Nginx, Docker, Fail2ban, AWS SG)
  • 100% detection rate on the PS2 labeled dataset (1,000 events)
  • 0% false-positive rate
  • 2 live Oracle Cloud VMs scanned over agentless SSH
  • < 1 second to classify the full 1,000-event dataset (≈10,000 events/sec)
  • 11-page auto-generated, auditor-ready PDF report
  • 7 compliance frameworks mapped (CIS · NIST · ISO 27001 · PCI-DSS · GDPR · RBI-CSF · NIST Zero Trust)

Architecture Overview

RootGuardians System Architecture

                         ┌─────────────────────────────────────────────┐
                         │            Your Laptop / Operator             │
                         │                                               │
   ┌─────────────┐  UI   │   React Frontend (Vite)  ── http :5173        │
   │   Browser   │◀──────┤        │  Dashboard · Drift Lab · VMs ·        │
   └─────────────┘       │        │  History · Evaluation                 │
                         │        ▼  REST / JSON                          │
                         │   FastAPI Backend  ── http :8000              │
                         │        │  deterministic engine + SQLite        │
                         └────────┼──────────────────────────────────────┘
                                  │  agentless SSH (read-only, Paramiko)
                  ┌───────────────┴────────────────┐
                  ▼                                 ▼
         ┌──────────────────┐              ┌──────────────────┐
         │  Oracle Cloud VM1 │              │  Oracle Cloud VM2 │
         │  instance-…0119   │              │  secondvm         │
         │  16 live controls │              │  16 live controls │
         └──────────────────┘              └──────────────────┘

   Optional outbound: OpenAI (AI chat / explanations) · EmailJS (alert emails)

The detection core is a pure deterministic pipeline — given the same inputs it always produces the same outputs. AI is an optional layer that only ever rewrites human-readable prose; it never touches a number.

Sources ─▶ Connectors ─▶ Normalizer ─▶ Drift Engine ─▶ Compliance ─▶ Risk + Confidence
                                                                          │
   Attack-Path ◀─ Explanation ◀─ Remediation ◀─ Exception (waiver) ◀──────┘
        │
        ▼
   Dashboard · SQLite History · Evaluation Scoreboard · PDF/CSV Audit Export
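
To make the Normalizer → Drift Engine step concrete, here is a minimal sketch of a baseline-relative comparison. It is an illustration under assumptions, not the code in engine/drift.py: the class names, field names other than expected_state, and the state values ("no", "active") are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BaselineControl:
    control_id: str
    expected_state: str  # the approved value (hypothetical encoding)
    criticality: str


@dataclass(frozen=True)
class Observation:
    control_id: str
    observed_state: str  # what the collector saw on the host
    evidence: str        # raw line or command output backing the observation


def detect_drift(baseline, observations):
    """Return one finding per observed baseline control, in a stable order."""
    expected = {control.control_id: control for control in baseline}
    findings = []
    # Sorting makes the output independent of collection order.
    for obs in sorted(observations, key=lambda o: o.control_id):
        control = expected.get(obs.control_id)
        if control is None:
            continue  # not in the approved baseline, nothing to compare against
        findings.append({
            "control_id": obs.control_id,
            "expected_state": control.expected_state,
            "observed_state": obs.observed_state,
            "drift_detected": obs.observed_state != control.expected_state,
            "evidence": obs.evidence,
        })
    return findings


baseline = [
    BaselineControl("firewall-enabled", "active", "high"),
    BaselineControl("ssh-root-login-disabled", "no", "critical"),
]
observations = [
    Observation("ssh-root-login-disabled", "yes", "PermitRootLogin yes"),
    Observation("firewall-enabled", "active", "Status: active"),
]
findings = detect_drift(baseline, observations)
for finding in findings:
    print(finding["control_id"], "DRIFT" if finding["drift_detected"] else "ok")
# firewall-enabled ok
# ssh-root-login-disabled DRIFT
```

Because the function has no hidden state and sorts its input, running it twice on the same baseline and observations gives identical findings, which is the property the pipeline relies on.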

Features

| Feature | Description | PS2 Requirement |
|---|---|---|
| Baseline registry | Versioned baseline.json of 16 approved controls (expected state, criticality, compliance). | Establish |
| Agentless SSH scanning | Read-only Paramiko collectors snapshot live Linux hosts; nothing is installed on the target. | Monitor |
| AWS Security-Group connector | Same engine assesses cloud SG ingress rules (sample dataset). | Monitor |
| Deterministic drift engine | Joins observations against the baseline; any deviation from expected_state is drift, with evidence. | Detect |
| Risk + posture scoring | Auditable criticality × exposure × compliance formula; baseline-relative posture (100 = perfect). | Detect / Alert |
| Compliance mapping | Every control tagged to 7 frameworks; chips render in UI and report. | Alert |
| Plain-English explanations | What changed, why it matters, business impact, and fix for each finding (template, optional AI rewrite). | Alert |
| Confidence score | Per-finding 0–100 detection confidence from recurrence + exposure + compliance. | Alert |
| Attack-path analysis | Correlates 2+ co-located active risks into a blast-radius attack chain. | Track |
| Waiver governance | Time-bounded, auto-expiring exceptions; authorized change ≠ breach. | Track |
| SQLite scan history | Every scan persisted; posture timeline, drift events, top risks, operator risk. | Track |
| PDF / CSV audit export | 11-page auditor report + CSV, filterable by VM and date range. | Track |
| Evaluation scoreboard | Self-scores against the labeled dataset: TP/FP/TN/FN, precision, recall, FPR. | Detect |
| Email alerts | EmailJS alerts on drift / waiver expiry, severity-filtered, deduplicated. | Alert |
| AI assistant | Context-aware GPT-4o-mini chat answering posture questions in plain English. | Alert |
| Drift Lab | Inject/reset drift on a live VM to demonstrate end-to-end detection. | Demo |
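
The risk and posture rows describe a product of three factors and a posture that reads 100 when nothing has drifted. The sketch below shows one way such a formula could look. The weights, the exposure labels, and the use of a framework count as the compliance factor are all assumptions chosen for easy arithmetic; the actual formulas are documented in docs/TECHNICAL.md.

```python
# Hypothetical weights: illustrative only, not the values used by engine/risk.py.
CRITICALITY = {"low": 1, "medium": 2, "high": 3, "critical": 4}
EXPOSURE = {"internal": 1, "public": 2}


def risk_score(criticality, exposure, framework_count):
    """criticality x exposure x compliance, using integer weights."""
    return CRITICALITY[criticality] * EXPOSURE[exposure] * framework_count


def posture_score(controls, drifted_ids):
    """Baseline-relative posture: 100 when no control has drifted."""
    def score(control):
        return risk_score(control["criticality"], control["exposure"], control["frameworks"])

    worst_case = sum(score(c) for c in controls)
    if worst_case == 0:
        return 100.0
    lost = sum(score(c) for c in controls if c["id"] in drifted_ids)
    return round(100 * (worst_case - lost) / worst_case, 1)


controls = [
    {"id": "mysql-3306-not-exposed", "criticality": "critical", "exposure": "public", "frameworks": 4},
    {"id": "firewall-enabled", "criticality": "high", "exposure": "internal", "frameworks": 3},
    {"id": "nginx-running", "criticality": "medium", "exposure": "internal", "frameworks": 2},
]
print(posture_score(controls, set()))                 # 100.0
print(posture_score(controls, {"firewall-enabled"}))  # 80.0
```

Every input is a lookup or an integer product, so a reviewer can recompute any score by hand, which is the point of keeping the formula free of learned parameters.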

Real-Time Alerts

RootGuardians pushes drift alerts to Email and Slack the moment a control drifts from baseline — severity-filtered and deduplicated so you only hear about what matters.

Email alert: drift pushed to your inbox the instant it's detected.
Slack alert: the same drift posted to your channel via Block Kit.
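
"Severity-filtered and deduplicated" can be pictured as a small gate in front of the senders. The following is a sketch under assumptions: the class, its method names, and the choice of (asset, control) as the deduplication key are hypothetical and not taken from the RootGuardians code.

```python
SEVERITY_RANK = {"info": 0, "low": 1, "medium": 2, "high": 3, "critical": 4}


class AlertGate:
    """Decides whether a drift finding should produce a notification."""

    def __init__(self, min_severity="high"):
        self.min_rank = SEVERITY_RANK[min_severity]
        self._already_sent = set()

    def should_send(self, asset, control_id, severity):
        # Severity filter: drop anything below the configured threshold.
        if SEVERITY_RANK[severity] < self.min_rank:
            return False
        # Deduplication: one alert per (asset, control) until it is cleared.
        key = (asset, control_id)
        if key in self._already_sent:
            return False
        self._already_sent.add(key)
        return True

    def clear(self, asset, control_id):
        """Call when the control returns to baseline so a later drift alerts again."""
        self._already_sent.discard((asset, control_id))


gate = AlertGate(min_severity="high")
print(gate.should_send("vm1", "firewall-enabled", "high"))  # True
print(gate.should_send("vm1", "firewall-enabled", "high"))  # False (duplicate)
print(gate.should_send("vm1", "nginx-running", "medium"))   # False (below threshold)
```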

Waivers

An authorized change isn't a breach. When a drift is legitimate, grant a time-bounded waiver — with a reason, an approver, and an expiry — and that risk stops counting against your posture until it auto-expires. A live countdown keeps the exception honest: it can't be quietly forgotten.
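
As a rough sketch of how a time-bounded waiver can resolve a drift, consider the model below. Only the "active_risk" status appears in the API example in this README; the Waiver fields, the function names, and the "compliant" and "approved_exception" labels are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Waiver:
    control_id: str
    reason: str
    approver: str
    expires_at: datetime

    def is_active(self, now):
        return now < self.expires_at

    def remaining(self, now):
        """Time left on the waiver, never negative (drives a countdown)."""
        return max(self.expires_at - now, timedelta(0))


def exception_status(drift_detected, waiver, now):
    if not drift_detected:
        return "compliant"            # hypothetical label
    if waiver is not None and waiver.is_active(now):
        return "approved_exception"   # hypothetical label
    return "active_risk"              # expired or missing waiver counts again


now = datetime(2026, 6, 14, 12, 0, tzinfo=timezone.utc)
waiver = Waiver("firewall-enabled", "maintenance window", "security lead", now + timedelta(hours=2))
print(exception_status(True, waiver, now))                       # approved_exception
print(exception_status(True, waiver, now + timedelta(hours=3)))  # active_risk
```

Expiry is evaluated at read time rather than by a background job, so a waiver cannot outlive its deadline even if nothing else runs.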

Active Waiver with Countdown


Tech Stack

| Layer | Technology |
|---|---|
| Frontend | React 18, Vite, hand-written CSS, EmailJS (alerts), OpenAI (chat via backend) |
| Backend | FastAPI, Python 3.10+, SQLite (scan history), Paramiko (agentless SSH), Pydantic v2 |
| AI | GPT-4o-mini for chat and optional explanation rewriting · deterministic engine for all detection |
| Infrastructure | Oracle Cloud, 2× VM.Standard.E2.1.Micro (Ubuntu 20.04) |

Quick Start

Prerequisites

  • Python 3.10+, Node.js 18+, Git
  • (Optional) an Ubuntu 20.04 VM reachable over SSH for live scanning — sample mode needs nothing.

Installation

git clone https://github.com/surajgojanur/RootGuardians.git
cd RootGuardians
cp .env.example .env          # then edit .env with your keys (optional for sample mode)

Running the Backend

pip install -r backend/requirements.txt
cd backend && uvicorn main:app --port 8000
# API on http://localhost:8000  ·  interactive docs at /docs

Running the Frontend

cd frontend
npm install
npm run dev                   # http://localhost:5173

Adding Your First VM

  1. Open the VMs tab → Add a VM.
  2. Enter host/IP, port, username, and choose your SSH private key file.
  3. Click Add VM — the key is held in backend memory only (never written to disk), the first scan runs immediately, and the VM joins the fleet on the Dashboard.

Prefer the terminal? python drift_detector.py --profile drifted runs a deterministic scan over the sample data and prints the posture, summary, and findings table.


Project Structure

RootGuardians/
├── README.md                       # this file
├── CHANGELOG.md                    # release history
├── drift_detector.py               # PS2 CLI entry point (forwards to cli/controlguard_cli.py)
├── .env.example                    # environment template (safe to commit)
├── docs/
│   ├── README.md                   # documentation index (links to every guide)
│   ├── TECHNICAL.md                # architecture, detection methodology, scoring, schemas
│   ├── USER_GUIDE.md               # non-technical guide for managers / auditors
│   ├── SETUP.md                    # developer install + VM hardening + troubleshooting
│   ├── MEDIA_GUIDE.md              # screenshot / GIF capture checklist for contributors
│   ├── DEMO_SCRIPT.md              # exact 5-minute presentation script + judge Q&A
│   ├── demo.gif                    # recorded walk-through
│   ├── media/                      # screenshots + GIFs referenced by README / User Guide
│   └── screenshots/                # original capture set (+ capture README)
├── backend/
│   ├── main.py                     # FastAPI app (uvicorn main:app)
│   ├── requirements.txt            # fastapi, uvicorn, pydantic, paramiko, openai, dotenv
│   └── controlguard/
│       ├── orchestrator.py         # runs the full deterministic pipeline + posture math
│       ├── models.py               # Pydantic schemas: Observation, Finding, ScanResult, Waiver
│       ├── store.py                # JSON I/O — baseline, waivers, scans, templates
│       ├── api/
│       │   ├── routes.py           # all REST endpoints (thin layer over the engine)
│       │   ├── scan_history.py     # SQLite scan-history store + audit queries
│       │   ├── evaluation.py       # PS2 scoreboard: classify vs ground truth, metrics
│       │   ├── attack_path.py      # blast-radius / multi-vector attack chains
│       │   ├── drift_history.py    # read-only analytics over the labeled CSV dataset
│       │   ├── sg_data.py          # Société Générale provided-dataset analysis
│       │   └── waivers.py          # time-bounded waiver overlay
│       ├── connectors/
│       │   ├── base.py             # BaseConnector contract — collect() → Observations
│       │   ├── linux.py            # sample-file Linux collector (clean/drifted)
│       │   ├── linux_ssh.py        # live agentless SSH collector (Paramiko)
│       │   ├── aws_sg.py           # AWS security-group collector
│       │   ├── vm_registry.py      # multi-VM registry (keys in memory, metadata on disk)
│       │   └── vm_control.py       # Drift Lab control plane (inject/reset drift)
│       ├── engine/
│       │   ├── drift.py            # baseline-relative drift detection
│       │   ├── risk.py             # deterministic risk_score, severity + confidence
│       │   ├── compliance.py       # framework mapping
│       │   ├── exceptions.py       # waiver resolution (approved vs expired)
│       │   ├── remediation.py      # attaches remediation playbooks
│       │   ├── explanation.py      # builds explanation fields (+ optional AI)
│       │   └── ai.py               # cached AI rewrite — OpenAI/Anthropic, off by default
│       ├── report/                 # server-rendered HTML scan + evaluation reports
│       └── data/                   # baseline.json, waivers, samples, scan_history.db (gitignored)
├── frontend/
│   └── src/
│       ├── App.jsx                 # dashboard shell, 5 tabs, alert provider, AI chat
│       ├── api.js                  # fetch wrappers for the REST API
│       ├── components/             # Dashboard, DriftLab, VmManager, ScanHistory,
│       │                           #   AuditReport, EvaluationBoard, AiChat, AlertSettings …
│       ├── hooks/useCountdown.js   # live waiver-expiry countdown
│       └── utils/                  # exportPDF, exportCSV, formatters
├── cli/                            # controlguard_cli.py, warm_ai_cache.py
├── notebooks/                      # RootGuardians_SG_Analysis.ipynb (PS2 analysis)
├── sample_data/                    # PS2 synthetic labeled dataset (1,000 events)
└── sample_data_by_societegenerale/ # the provided SG dataset (1,000 events)

PS2 Deliverables Checklist

  • GitHub repo with drift_detector.py entry point
  • Jupyter notebook (notebooks/RootGuardians_SG_Analysis.ipynb)
  • 20+ drifts flagged with explanations (115 anomalies detected on the labeled set, each explained)
  • Interactive dashboard (React, 5 tabs, live + sample modes)
  • Technical documentation (docs/TECHNICAL.md, USER_GUIDE.md, SETUP.md)
  • Audit report export (11-page PDF + CSV)
  • 5-minute presentation (script ready in docs/DEMO_SCRIPT.md)

PS2 Success Criteria Results

| Metric | Target | RootGuardians |
|---|---|---|
| Detection Rate | > 80% | 100% |
| False Positive Rate | < 15% | 0% |
| Time Lag | < 1 hour | < 10 seconds (on-demand rescan) |
| Explainability | Every alert | ✅ AI + deterministic templates |
| Compliance Mapping | NIST / CIS / GDPR | 7 frameworks |

Evaluation Results

Measured by the built-in Evaluation tab against the labeled PS2 dataset (1,000 events). Ground truth is derived from the data itself (status + change type + severity); the classifier uses the same deterministic logic the product uses.

                            Predicted
                     Anomalous     Benign
        Anomalous       115          0          ← 0 false negatives
Actual
        Benign            0         885         ← 0 false positives
Precision: 100%   |   Recall: 100%   |   F1: 100%   |   Accuracy: 100%

Because the dataset's severity domain is exactly {Critical, High, Medium, Low, Info}, recognizing authorized change types (rollback / scheduled change) and low-severity noise as benign yields a perfect, honest separation — shown transparently in the UI (confusion matrix, per-severity breakdown, and the exact FP/FN lists, which are empty).
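
The headline metrics follow directly from the four cells of the confusion matrix. This short sketch recomputes them from the counts reported above (TP = 115, FP = 0, TN = 885, FN = 0); the function name is illustrative and is not the one used in api/evaluation.py.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)
    return {
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "fpr": ratio(fp, fp + tn),  # false-positive rate
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
    }


metrics = classification_metrics(tp=115, fp=0, tn=885, fn=0)
print(" ".join(f"{name}={value:.0%}" for name, value in metrics.items()))
# precision=100% recall=100% f1=100% fpr=0% accuracy=100%
```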


Security Design

  • Read-only collectors — the SSH connector only runs non-mutating commands (cat, stat, ss, systemctl is-active). A scan never writes to the target.
  • Keys in memory only — uploaded SSH keys live in the backend process memory; only non-secret VM metadata (name, host, port, username) is persisted. Keys never touch disk and never appear in logs or API responses.
  • Deterministic detection — no black-box model decides risk. Every score is a reproducible formula a regulator can audit.
  • Secrets via environment: OPENAI_API_KEY is read server-side only; the AI chat is proxied through the backend so the key never reaches the browser. .env, target.json, scan artifacts, and the SQLite DB are git-ignored.
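
One way to enforce the read-only property is to check every command against an allow-list before it is sent over SSH. The guard below is a hypothetical sketch built around the four commands named above (cat, stat, ss, systemctl is-active); the function and constant names are assumptions, and the real collector may enforce this differently.

```python
import shlex

READ_ONLY_COMMANDS = {"cat", "stat", "ss"}
READ_ONLY_SUBCOMMANDS = {"systemctl": {"is-active"}}
# Reject anything that could chain, redirect, or substitute commands.
SHELL_METACHARACTERS = set(";|&<>`$\n")


def is_read_only(command):
    """Return True only for a single allow-listed, non-mutating command."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar parse errors
        return False
    if not argv:
        return False
    program = argv[0]
    if program in READ_ONLY_COMMANDS:
        return True
    allowed = READ_ONLY_SUBCOMMANDS.get(program)
    return allowed is not None and len(argv) > 1 and argv[1] in allowed


print(is_read_only("cat /etc/ssh/sshd_config"))        # True
print(is_read_only("systemctl is-active ufw"))         # True
print(is_read_only("systemctl stop ufw"))              # False
print(is_read_only("cat /etc/passwd; rm -rf /tmp/x"))  # False
```

A deny-by-default check like this fails closed: a command that is not explicitly listed is never executed, even if it happens to be harmless.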

API Reference

Base URL: http://localhost:8000 · Interactive docs: http://localhost:8000/docs

| Method | Path | Description |
|---|---|---|
| POST | /api/scan | Run a sample (clean/drifted) or live (linux_ssh) scan. |
| GET | /api/scans/latest | Latest persisted scan result. |
| POST | /api/vms | Register a VM (multipart: host, port, username, key file). |
| GET | /api/vms | List registered VMs (key-free view; key_missing flag). |
| POST | /api/vms/{id}/scan | Scan one registered VM over SSH. |
| POST | /api/vms/{id}/rekey | Re-upload a key for a VM restored after a restart. |
| POST | /api/vm/control | Drift Lab control plane (status / drift / reset / per-control toggle). |
| GET | /api/history/scans | Persisted scan history (filter: days, asset). |
| GET | /api/history/timeline | Posture timeline points. |
| GET | /api/history/assets | Distinct scanned assets (for the export VM picker). |
| GET | /api/history/top-risks | Top 10 riskiest drift findings across history. |
| GET | /api/evaluation | PS2 scoreboard (TP/FP/TN/FN, precision, recall, FPR). |
| POST | /api/ai/chat | Context-aware security assistant (GPT-4o-mini, server-side key). |
| GET / POST / DELETE | /api/waivers | List / grant / revoke time-bounded waivers. |

Example — run a drifted scan

curl -s -X POST http://localhost:8000/api/scan \
  -H "Content-Type: application/json" \
  -d '{"profile":"drifted"}'
{
  "scan_id": "scan-20260614-...",
  "posture_score": 53.5,
  "summary": { "total_controls": 16, "drift_count": 5, "active_risks": 3, ... },
  "findings": [ { "control_id": "ssh-password-auth-disabled", "drift_detected": true,
                  "risk_score": 9.0, "severity": "medium", "exception_status": "active_risk",
                  "compliance": [ {"framework":"CIS","id":"5.4.4"}, ... ] }, ... ]
}
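
The same call can be made from Python with only the standard library. This is a small client sketch, assuming the backend is running locally on port 8000 and that the response has the shape shown above; the helper names are illustrative.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_scan_request(profile="drifted"):
    """Build the POST /api/scan request without sending it."""
    body = json.dumps({"profile": profile}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/scan",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_scan(profile="drifted", timeout=30):
    """Send the request and return the decoded JSON scan result."""
    with urllib.request.urlopen(build_scan_request(profile), timeout=timeout) as response:
        return json.load(response)


def summarize(result):
    """One-line summary built from the fields shown in the example response."""
    drifted = [f["control_id"] for f in result["findings"] if f["drift_detected"]]
    return f"posture {result['posture_score']}: {len(drifted)} drifted ({', '.join(drifted)})"


if __name__ == "__main__":
    # Requires the backend from "Running the Backend" to be up.
    print(summarize(run_scan("drifted")))
```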

Example — ask the AI assistant

curl -s -X POST http://localhost:8000/api/ai/chat \
  -H "Content-Type: application/json" \
  -d '{"message":"What should I fix first?",
       "context":{"posture_score":53.5,"drift_count":5,
                  "active_risks":[{"title":"SSH password auth","severity":"medium","risk_score":9.0}],
                  "asset":"instance-20260606-0119"}}'
# → {"reply": "Prioritise the AWS database port exposed to the internet ..."}

Controls Reference

All 16 controls live in backend/data/baseline/baseline.json.

| Control ID | Title | System | Criticality | Compliance |
|---|---|---|---|---|
| ssh-root-login-disabled | SSH root login disabled | Linux | Critical | CIS 5.4.10 · ISO A.9.2.3 · PCI 8.2.1 · NIST AC-2/IA-2 |
| ssh-password-auth-disabled | SSH password authentication disabled | Linux | High | CIS 5.4.4 · ISO A.9.4.2 · NIST IA-5/AC-17 |
| firewall-enabled | Firewall enabled | Linux | High | CIS 3.5.1.1 · ISO A.13.1.1 · PCI 1.2.1 · NIST SC-7 |
| audit-logging-enabled | Audit logging enabled | Linux | Medium | CIS 4.1.1.1 · ISO A.12.4.1 · PCI 10.2.1 · GDPR Art.32 |
| mysql-3306-not-exposed | MySQL port 3306 not exposed publicly | Linux | Critical | CIS 3.5.1.2 · ISO A.13.1.3 · PCI 1.3.1 · GDPR Art.32 |
| docker-socket-not-exposed | Docker socket not exposed | Linux | High | CIS 2.8 · ISO A.13.1.3 · NIST CM-7/AC-6 |
| sensitive-files-not-world-writable | Sensitive files not world-writable | Linux | Medium | CIS 6.1.10 · ISO A.9.4.1 · PCI 7.1.1 · GDPR Art.25 |
| aws-sg-ssh-not-public | AWS SG must not expose SSH (22) to 0.0.0.0/0 | AWS | Critical | CIS 5.2 · ISO A.13.1.1 · PCI 1.2.1 · NIST SC-7 |
| aws-sg-db-not-public | AWS SG must not expose DB (3306) to 0.0.0.0/0 | AWS | High | CIS 5.2 · ISO A.13.1.3 · PCI 1.3.1 · GDPR Art.32 |
| mysql-root-password-set | MySQL root account has a password set | Linux | Critical | CIS 4.1 · PCI 8.2.1 · ISO A.9.4.3 |
| mysql-bind-localhost | MySQL binds to localhost only | Linux | Critical | CIS 6.1 · PCI 1.3.1 · ISO A.13.1.3 · NIST SC-7 |
| nginx-running | Nginx web server is running | Linux | Medium | CIS 3.5 · ISO A.12.1.2 |
| nginx-default-page-disabled | Nginx default page is disabled | Linux | Low | CIS 2.2.4 · ISO A.14.2.5 |
| docker-running | Docker daemon is running | Linux | Medium | CIS 2.1 · ISO A.12.1.2 |
| docker-socket-permissions | Docker socket is not world-readable | Linux | High | CIS 2.8 · NIST AC-6 · ISO A.9.4.1 |
| fail2ban-active | Fail2ban brute-force protection is active | Linux | High | CIS 5.3.4 · NIST AC-7 · ISO A.9.4.2 · PCI 8.1.6 |
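
The compliance column can be turned into a per-framework impact map for any set of drifted controls. The sketch below copies three rows from the table above; the dictionary layout and the function name are assumptions, and the real baseline.json schema may differ.

```python
# Three rows copied from the controls table; the structure is illustrative.
CONTROL_COMPLIANCE = {
    "ssh-root-login-disabled": {
        "CIS": ["5.4.10"], "ISO": ["A.9.2.3"], "PCI": ["8.2.1"], "NIST": ["AC-2", "IA-2"],
    },
    "firewall-enabled": {
        "CIS": ["3.5.1.1"], "ISO": ["A.13.1.1"], "PCI": ["1.2.1"], "NIST": ["SC-7"],
    },
    "audit-logging-enabled": {
        "CIS": ["4.1.1.1"], "ISO": ["A.12.4.1"], "PCI": ["10.2.1"], "GDPR": ["Art.32"],
    },
}


def compliance_impact(drifted_control_ids):
    """Group the requirement IDs touched by the drifted controls, per framework."""
    impact = {}
    for control_id in drifted_control_ids:
        for framework, refs in CONTROL_COMPLIANCE.get(control_id, {}).items():
            impact.setdefault(framework, set()).update(refs)
    # Sort frameworks and references so the report is stable across runs.
    return {framework: sorted(refs) for framework, refs in sorted(impact.items())}


for framework, refs in compliance_impact(["firewall-enabled", "audit-logging-enabled"]).items():
    print(framework, ", ".join(refs))
# CIS 3.5.1.1, 4.1.1.1
# GDPR Art.32
# ISO A.12.4.1, A.13.1.1
# NIST SC-7
# PCI 1.2.1, 10.2.1
```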

Limitations & Future Work

We are honest about what is MVP vs production:

  • Storage: JSON files + SQLite are sufficient for the demo; production would use PostgreSQL or a time-series store.
  • Authentication: the console has no auth today; production needs OAuth2 / SSO + RBAC.
  • Scale: 2 live VMs are wired up; the registry architecture has no built-in host limit (each scan is independent work).
  • AWS: the AWS security-group controls run against a sample dataset; production would call the real AWS EC2/VPC SDK.
  • Alerting: email (via EmailJS) and Slack today; production would add PagerDuty and server-side scheduled scans.

See docs/TECHNICAL.md for the full architecture and docs/DEMO_SCRIPT.md for the presentation flow.


Documentation

| Guide | For |
|---|---|
| docs/TECHNICAL.md | Developers: architecture, detection methodology, scoring formulas, schemas. |
| docs/USER_GUIDE.md | End users & auditors: plain-English walkthrough of every feature, FAQ, glossary. |
| docs/SETUP.md | Operators: install, VM hardening, environment, troubleshooting. |
| docs/MEDIA_GUIDE.md | Contributors: screenshot & GIF capture checklist. |
| docs/DEMO_SCRIPT.md | Presenters: 5-minute script and judge Q&A. |
| notebooks/ | Analysts: Jupyter analysis of the PS2 / Société Générale dataset. |

A full index lives at docs/README.md.


Team

RootGuardians — built for the Société Générale Hackathon 2026, Problem Statement 02: Security Control Drift & Misconfiguration Detection.

Built by Suraj Gojanur and Deep Saha.

License

Released under the MIT License.
