Preet37/Sentinel

Sentinel

Sentinel is a real-time safety gateway for autonomous agents. It sits between agent intent and action execution, scores risk with an LLM-powered policy layer, and enforces human-in-the-loop approval over voice for high-risk operations.

This repository demonstrates a practical pattern for safe autonomous execution across three operational domains:

  • VaultKeeper (FinOps): payments and invoice actions
  • PrivacyShield (Data): export/share access to sensitive data
  • OpsGuard (Infrastructure): destructive and high-blast-radius operations

When an action looks risky, Sentinel blocks execution and calls an admin via Telnyx. The admin can respond with a DTMF keypress or ask questions conversationally before approving or declining.


Why This Project Exists

Autonomous agents can initiate meaningful business actions quickly, but they also introduce new failure modes:

  • approving fraudulent invoices
  • exporting sensitive data without context
  • running destructive infrastructure commands

Sentinel addresses this by adding a dedicated risk control plane:

  1. Agent submits intent (action, payload, reasoning).
  2. Sentinel analyzes risk and applies hard rules/policies.
  3. Low-risk actions auto-approve.
  4. Medium/high-risk actions require voice authentication.
  5. Final decision is observable in UI, logs, and monitoring traces.

The goal is not just blocking; it is controlled autonomy with auditable reasoning.


Repository Structure

Sentinel/
├─ agent/
│  └─ agent.py
├─ backend/
│  ├─ main.py
│  ├─ policy.py
│  ├─ requirements.txt
│  └─ .env
├─ frontend-new/
│  ├─ src/
│  │  ├─ pages/Index.tsx
│  │  └─ components/*.tsx
│  └─ package.json
└─ README.md

agent/

  • Simulated AGI-style agent client.
  • Sends a high-risk PAY_INVOICE action to Sentinel.
  • Polls backend status until approved or declined.

backend/

  • FastAPI service implementing:
    • risk analysis with Groq
    • hard policy overrides
    • action state machine
    • Telnyx voice workflow/webhook handling
    • Sentry tracing/tags

frontend-new/

  • React + Vite dashboard that visualizes:
    • module triggers
    • risk score/status progression
    • terminal-like event feed
    • Telnyx voice auth state
    • monitoring panel (Sentry-oriented UI)

Core Product Concept

Sentinel acts as a policy firewall in front of autonomous systems:

  • It does not trust agent confidence alone.
  • It combines LLM judgment with deterministic policy.
  • It escalates to a human when the risk score crosses the threshold (risk_score > 50 in this demo).
  • It records enough context to support operator decisions.

This model is portable across finance, data, and operations workflows.


End-to-End Flow

  1. A module trigger (frontend button or agent.py) sends POST /api/sentinel/execute.
  2. Backend sets status to ANALYZING.
  3. Groq generates:
    • risk_score (0-100)
    • short explanation (analysis)
  4. Sentinel applies deterministic module rules:
    • e.g. PII export restrictions, hard block on DROP_TABLE
  5. Sentinel applies demo bucketing so that the demo scenarios produce predictable outcomes.
  6. Decision:
    • risk_score <= 50 -> auto-approve (APPROVED)
    • risk_score > 50 -> BLOCKED_AWAITING_AUTH + Telnyx outbound call
  7. Admin interaction:
    • DTMF 1: approve
    • DTMF 2: enter Q&A mode
    • speech Q&A: the admin asks questions, hears Groq-generated explanations, then approves or declines verbally
  8. Final state is reflected via:
    • GET /api/sentinel/status
    • frontend status/log components
    • optional Sentry instrumentation
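
The decision step of this flow can be sketched as follows. This is a minimal illustration, not the code in backend/main.py: the `decide` function and the `Decision` type are hypothetical names, and only the threshold of 50, the DROP_TABLE hard block, and the status names are taken from the flow above.

```python
from dataclasses import dataclass

RISK_THRESHOLD = 50                    # from the flow above: scores <= 50 auto-approve
HARD_BLOCKED_ACTIONS = {"DROP_TABLE"}  # declined immediately, no call is placed


@dataclass(frozen=True)
class Decision:
    status: str
    needs_voice_auth: bool


def decide(action: str, risk_score: int) -> Decision:
    """Map an action and its 0-100 risk score to a status (hypothetical helper)."""
    if not 0 <= risk_score <= 100:
        raise ValueError(f"risk_score out of range: {risk_score}")
    if action in HARD_BLOCKED_ACTIONS:
        return Decision("DECLINED", needs_voice_auth=False)
    if risk_score <= RISK_THRESHOLD:
        return Decision("APPROVED", needs_voice_auth=False)
    # Anything above the threshold waits for the Telnyx voice approval.
    return Decision("BLOCKED_AWAITING_AUTH", needs_voice_auth=True)
```

Checking the hard block before the score means a low LLM score can never auto-approve a hard-blocked action.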

Module Risk Behavior

1) VaultKeeper (FinOps)

  • Primary action: PAY_INVOICE
  • Demo high-risk case: $10,000 to "Unknown Corp" from a session_* agent -> high-risk + voice auth
  • Low-risk examples: small payments to trusted vendors

2) PrivacyShield (Data)

  • Actions: EXPORT_CSV, SHARE_RECORD, QUERY_SSN
  • Escalates when:
    • record volume is significant
    • payload includes PII/SSN indicators
  • Demonstrates exfiltration-aware policy behavior

3) OpsGuard (Infrastructure)

  • Actions include: DELETE_USER, DROP_TABLE, RESTART_SERVER
  • DROP_TABLE is hard-blocked (declined immediately)
  • Production-impact user deletion escalates
  • Non-production restarts are treated as lower-risk

Backend API

Base URL (local): http://localhost:8000

POST /api/sentinel/execute

Submits an agent action for analysis.

Request shape:

{
  "agent_id": "string",
  "action": "string",
  "payload": {},
  "reasoning": "string"
}
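
As a rough illustration, a client could build and send this request with only the Python standard library. The helper names and the example payload fields (`amount`, `vendor`) are assumptions, and the sketch assumes the endpoint replies with JSON.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # local base URL from this README


def build_execute_request(agent_id: str, action: str, payload: dict,
                          reasoning: str) -> urllib.request.Request:
    """Build the POST request for /api/sentinel/execute (hypothetical helper)."""
    body = json.dumps({
        "agent_id": agent_id,
        "action": action,
        "payload": payload,
        "reasoning": reasoning,
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/sentinel/execute",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def submit(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send the request and parse the reply (assumes a JSON response body)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Example payload; the field names inside "payload" are assumptions.
request = build_execute_request(
    "session_demo", "PAY_INVOICE",
    {"amount": 10000, "vendor": "Unknown Corp"},
    "Invoice is due today",
)
```

Calling `submit(request)` with the backend running would send the action for analysis.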

Possible response statuses:

  • EXECUTED (auto-approved path)
  • BLOCKED_AWAITING_AUTH (requires Telnyx approval)
  • DECLINED (hard block)
  • ERROR_TELNYX (auth channel failure)

GET /api/sentinel/status

Returns global runtime state:

  • current status
  • risk score
  • latest analysis
  • latest DTMF/question/answer artifacts
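
A client such as agent.py can poll this endpoint until a final decision appears. The sketch below is hypothetical: it assumes the response is a JSON object with a `status` key and treats the statuses listed in this README as terminal. The `fetch_status` callable is injected so the loop does not depend on a particular HTTP library.

```python
import time
from typing import Callable

# Statuses treated as final here; this grouping is an assumption.
TERMINAL_STATUSES = {"APPROVED", "EXECUTED", "DECLINED", "ERROR_TELNYX"}


def wait_for_decision(
    fetch_status: Callable[[], dict],
    interval: float = 2.0,
    max_polls: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until a terminal status is seen or max_polls is exhausted."""
    for _ in range(max_polls):
        status = fetch_status().get("status", "")  # "status" key is assumed
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval)
    raise TimeoutError(f"no decision after {max_polls} polls")
```

In practice `fetch_status` would wrap a GET to /api/sentinel/status; bounding the number of polls avoids hanging forever if the call is never answered.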

POST /api/telnyx/webhook

Receives Telnyx events:

  • call.answered
  • call.dtmf.received
  • call.gather.ended
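
Routing these events might look like the sketch below. It assumes the Telnyx v2 webhook envelope, where the event name sits at `data.event_type` and details such as the pressed `digit` sit under `data.payload`; verify this against the Telnyx documentation. The returned labels are hypothetical and stand in for whatever the backend actually does.

```python
from typing import Any


def route_telnyx_event(body: dict[str, Any]) -> str:
    """Translate a webhook body into a next-step label (hypothetical helper)."""
    data = body.get("data", {})
    event_type = data.get("event_type")
    payload = data.get("payload", {})

    if event_type == "call.answered":
        return "PLAY_PROMPT"        # read the risk summary to the admin
    if event_type == "call.dtmf.received":
        digit = payload.get("digit")
        if digit == "1":
            return "APPROVE"        # DTMF 1: approve
        if digit == "2":
            return "START_QA"       # DTMF 2: enter Q&A mode
        return "IGNORE"
    if event_type == "call.gather.ended":
        return "HANDLE_SPEECH"      # spoken question or verbal decision
    return "IGNORE"                 # event types this demo does not use
```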

Tech Stack

Backend

  • Python
  • FastAPI
  • Uvicorn
  • Pydantic
  • Groq API (Llama 3.3 model invocation)
  • Telnyx Voice API
  • Sentry SDK

Frontend

  • React + TypeScript
  • Vite
  • Tailwind CSS
  • shadcn-ui / Radix primitives
  • Axios

Local Setup

1) Prerequisites

  • Python 3.10+
  • Node.js 18+
  • npm
  • Groq account/API key
  • Telnyx account/API key and voice connection
  • (Optional) Sentry project DSN

2) Configure environment variables

Create/update backend/.env and agent/.env.

Backend expected variables:

  • TELNYX_API_KEY
  • TELNYX_PHONE_NUMBER
  • ADMIN_PHONE_NUMBER
  • TELNYX_CONNECTION_ID (optional; the code supplies a default)
  • GROQ_API_KEY

Agent expected variables:

  • AGI_API_KEY
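
A startup check for the backend variables could look like the sketch below. The `load_backend_config` helper is hypothetical and is not part of the repository. It assumes the values are already present in the process environment, for example after a tool such as python-dotenv has loaded backend/.env.

```python
import os
from typing import Mapping

REQUIRED_BACKEND_VARS = (
    "TELNYX_API_KEY",
    "TELNYX_PHONE_NUMBER",
    "ADMIN_PHONE_NUMBER",
    "GROQ_API_KEY",
)
OPTIONAL_BACKEND_VARS = ("TELNYX_CONNECTION_ID",)  # the code has a default


def load_backend_config(env: Mapping[str, str] = os.environ) -> dict[str, str]:
    """Collect backend settings and fail fast when a required one is missing."""
    missing = [name for name in REQUIRED_BACKEND_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    config = {name: env[name] for name in REQUIRED_BACKEND_VARS}
    for name in OPTIONAL_BACKEND_VARS:
        if env.get(name):
            config[name] = env[name]
    return config
```

Failing at startup surfaces a missing key before the first outbound call is attempted.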

3) Run backend

From backend/:

pip install -r requirements.txt
uvicorn main:app --reload --port 8000

4) Run frontend

From frontend-new/:

npm install
npm run dev

Default Vite URL is typically http://localhost:5173.

5) Run agent simulator (optional)

From agent/:

pip install requests python-dotenv
python agent.py

The agent will submit a high-risk invoice scenario and wait for approval status updates.


Demo Walkthrough

  1. Start backend and frontend.
  2. Open frontend dashboard.
  3. Trigger one of:
    • PAY_INVOICE (High Risk)
    • EXPORT_CSV (Medium Risk)
    • DELETE_USER (Medium Risk)
  4. Watch the Sentinel state move from monitoring to analyzing, then to blocked or approved.
  5. If blocked, answer Telnyx call:
    • press 1 to approve
    • press 2 for spoken Q&A mode
  6. Confirm that the terminal feed, shield status, and risk card all update.
  7. (Optional) Run agent.py for the session-based AGI demo path.

Important Design Notes

  • Global in-memory state: CURRENT_STATE is process-local and not persistent.
  • Single-process demo assumptions: concurrent multi-tenant usage is not modeled yet.
  • Policy layering: combines LLM analysis with hard deterministic safeguards.
  • Human-in-the-loop: high-risk path requires explicit approval channel.
  • Observability hooks: Sentry transactions/tags are integrated in execution flow.

Security & Production Hardening Checklist

If you evolve this into production, prioritize:

  • Replace global state with durable store (Redis/Postgres/event log).
  • Add authn/authz for all API endpoints.
  • Validate and sign webhook requests (Telnyx signature verification).
  • Move all secrets to secure secret manager.
  • Remove hardcoded DSN values from source.
  • Add idempotency and replay protection for action execution.
  • Add structured audit logging (who approved, channel, time, reason).
  • Add robust retry and timeout policies for external APIs.
  • Add test suites for:
    • policy edge-cases
    • webhook event parsing
    • approval/decline state transitions
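
As one example from this checklist, structured audit logging could start from a sketch like the following. The record fields mirror the items named above (who approved, channel, time, reason). The `audit_record` helper, the `action_id` field, and the logger name are hypothetical.

```python
import json
import logging
from datetime import datetime, timezone
from typing import Optional

audit_logger = logging.getLogger("sentinel.audit")  # logger name is an assumption


def audit_record(action_id: str, approver: str, channel: str, decision: str,
                 reason: str, now: Optional[datetime] = None) -> str:
    """Emit one JSON line per approval decision and return it."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    line = json.dumps({
        "action_id": action_id,
        "approver": approver,   # who approved or declined
        "channel": channel,     # e.g. voice DTMF or spoken Q&A
        "decision": decision,
        "reason": reason,
        "timestamp": timestamp,
    }, sort_keys=True)
    audit_logger.info(line)
    return line
```

One JSON object per line keeps the log easy to ship to a durable store later.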

Known Limitations

  • Demo-first risk bucketing intentionally constrains scenarios.
  • Some frontend monitoring widgets are illustrative, not live-linked.
  • No database persistence for incidents, approvals, or replay history.
  • Webhook flow behavior may vary with Telnyx account capabilities (speech gather settings).

Suggested Next Iterations

  • Persist action lifecycle to a datastore and add incident timeline UI.
  • Introduce policy versioning and per-module policy packs.
  • Add role-based and risk-tiered approval routing.
  • Add Slack/Teams fallback when voice call is not answered.
  • Add simulation harness for regression testing of risk policies.

License

No license file is currently included in this repository. Add one before external distribution.
