vin-jl/sentinel

Sentinel

A zero-trust AI gateway for real-time PII masking, dynamic LLM routing, and telemetry logging.

Sentinel provides a secure, asynchronous Python gateway that intercepts data-sensitive AI prompts. It detects and masks PII using Presidio, storing the original values in a Redis-backed token vault so they can be restored in the response. The system then dynamically routes queries via LangChain and logs telemetry to a SQLite database.


Motivation

Most enterprises (or even small groups) want to adopt AI assistants but face a hard constraint: sensitive data (names, emails, phone numbers) cannot be sent to third-party LLM providers. The typical answer is to either ban cloud LLMs entirely or trust the provider's data handling.

Sentinel sits between the user-facing chatbot (Microsoft Copilot Studio) and the LLMs, acting as a governance layer that strips user-sensitive data before any prompt leaves the network, vaults the original values in a TTL-scoped Redis store, and restores them in the response. Simple queries stay on a local Ollama instance that never touches the internet; only complex prompts that need a larger model are forwarded to Gemini, with PII already removed.


Tech Stack

  • Frontend: Microsoft Copilot Studio (via Ngrok)
  • API Gateway: Python 3.11, FastAPI, Uvicorn, Pydantic
  • LLM Orchestration: LangChain (Ollama, Google Gemini)
  • PII Engine: Microsoft Presidio + spaCy (en_core_web_sm)
  • State Management: Redis (internal Docker network)
  • Telemetry: SQLite (host-mounted volume)

Getting Started

Prerequisites

  • Docker & Docker Compose
  • A Google Gemini API key
  • Ngrok account (for Copilot Studio integration)

Setup

  1. Clone the repository:

    git clone https://github.com/<your-org>/sentinel.git
    cd sentinel
  2. Configure environment variables in a .env file at the project root:

    SENTINEL_API_KEY=<your-secure-random-key>
    GEMINI_API_KEY=<your-gemini-api-key>
    
  3. Launch services via Docker:

    docker compose up --build -d
  4. Pull a local model:

    docker exec -it llm-local ollama pull llama3
  5. Test the gateway:

    curl -X POST http://localhost:8000/v1/chat \
      -H "Content-Type: application/json" \
      -H "X-API-Key: <your-sentinel-api-key>" \
      -d '{"session_id": "test-001", "message": "Hello, how are you?"}'
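
The same smoke test can be scripted from Python. The following is a minimal sketch using only the standard library; the helper names (`build_chat_request`, `parse_chat_reply`, `send_chat`) and the `ChatReply` dataclass are hypothetical and not part of the Sentinel codebase. It assumes the gateway is reachable at the base URL you pass in and returns the `reply` and `metadata` fields described in the API contract.

```python
import json
import urllib.request
from dataclasses import dataclass


@dataclass
class ChatReply:
    """Hypothetical client-side view of the /v1/chat response."""
    reply: str
    routed_to: str
    pii_entities_masked: int
    latency_ms: int


def build_chat_request(base_url: str, api_key: str,
                       session_id: str, message: str) -> urllib.request.Request:
    """Build the same POST request that the curl command sends."""
    body = json.dumps({"session_id": session_id, "message": message}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/v1/chat",
        data=body,
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def parse_chat_reply(raw: bytes) -> ChatReply:
    """Flatten the JSON response body into a ChatReply."""
    payload = json.loads(raw)
    metadata = payload["metadata"]
    return ChatReply(
        reply=payload["reply"],
        routed_to=metadata["routed_to"],
        pii_entities_masked=metadata["pii_entities_masked"],
        latency_ms=metadata["latency_ms"],
    )


def send_chat(base_url: str, api_key: str,
              session_id: str, message: str) -> ChatReply:
    """Send one chat turn; raises urllib.error.HTTPError on 401 or 429."""
    request = build_chat_request(base_url, api_key, session_id, message)
    with urllib.request.urlopen(request, timeout=60) as response:
        return parse_chat_reply(response.read())
```

With the containers running, `send_chat("http://localhost:8000", "<your-sentinel-api-key>", "test-001", "Hello, how are you?")` should return a `ChatReply` whose `routed_to` field shows which model handled the prompt.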

Connecting Microsoft Copilot Studio

Copilot Studio can use Sentinel as its backend by calling the /v1/chat endpoint through an Ngrok HTTPS tunnel.

# Expose the local gateway to the public internet
ngrok http 8000

Note: Copy the secure forwarding URL generated by Ngrok (e.g., https://abc123.ngrok-free.app).

Navigate to your Microsoft Copilot Studio portal and create a new Custom Connector to link your agent to the Sentinel gateway.

Configure the security and routing settings as follows:

  • Base URL: <YOUR_NGROK_HTTPS_URL>
  • Authentication Type: API Key
  • API Key Parameter Name: X-API-Key
  • Parameter Location: Header
  • API Key Value: The value of your .env SENTINEL_API_KEY

Create a new POST action pointing to the /v1/chat endpoint. You must define the Request and Response schemas so Copilot Studio understands the API contract.


Request Payload Schema:

{
  "session_id": "string",
  "message": "string"
}

Response Payload Schema:

{
  "reply": "string",
  "metadata": {
    "routed_to": "string",
    "pii_entities_masked": 0,
    "latency_ms": 0
  }
}

Inside your Copilot Studio dialogue tree, add a node to "Call an action" and select your Sentinel connector. Map the Copilot system variables to the JSON request payload:

  1. message: Map to the user's raw text input (the conversation turn).
  2. session_id: Map to System.Conversation.Id.
  3. Output: Map the API's reply response field to a standard Copilot chat bubble node.

The agent will now route all messages through Sentinel.


Architecture

         [ Client ]
             |
             |    POST /v1/chat (X-API-Key)
             v
+---------------------------------------+
|    1. Auth & Intercept (FastAPI)      |
+---------------------------------------+
             |
             |    (Internal Network)
             v
+---------------------------------------+
|    2. PII Masking (Presidio)          | <---> [ Redis Vault ] 
+---------------------------------------+      
             |
             v
+---------------------------------------+
|    3. Semantic Router (LangChain)     |
+---------------------------------------+
          /                   \
     (Simple)               (Complex)
        /                       \
 [ Local Ollama ]         [ Google Gemini ]
        \                       /
         +----------+----------+
                    |
                    v
+---------------------------------------+
|    4. PII Unmasking                   | <---> [ Redis Vault ]
+---------------------------------------+
                    |
                    | (Async)
                    +------------> [ SQLite Telemetry DB ]
                    |                   (Host Volume)
                    v
            [ JSON Response ]

  • redis-vault and llm-local are on an internal Docker bridge network with no host ports exposed.
  • Gemini is called over HTTPS from the gateway container; PII is already stripped before the request leaves.
  • data/telemetry.db lives on a host volume mount (./data:/data); it is not a network service.
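
To make steps 2 to 4 of the diagram concrete, here is a small, dependency-free sketch of the mask, route, and unmask round trip. It is an illustration under stated assumptions, not the code in app/vault.py or app/router.py: a single email regex stands in for Presidio, an in-memory dict with expiry times stands in for the Redis vault, the `<PII_xxxxxxxx>` token format is invented, and the word-count routing rule is a placeholder for whatever criterion the real semantic router applies.

```python
import re
import time
import uuid
from typing import Optional

# Assumption: one email regex stands in for Presidio's entity recognizers.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Assumption: hypothetical placeholder token format.
TOKEN_PATTERN = re.compile(r"<PII_[0-9a-f]{8}>")


class InMemoryVault:
    """Stand-in for the Redis vault: token -> (original value, expiry time)."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl_seconds = ttl_seconds
        self._entries: dict = {}

    def put(self, original: str) -> str:
        """Store the original value and return the token that replaces it."""
        token = f"<PII_{uuid.uuid4().hex[:8]}>"
        self._entries[token] = (original, time.monotonic() + self.ttl_seconds)
        return token

    def get(self, token: str) -> Optional[str]:
        """Return the original value, or None if unknown or past its TTL."""
        entry = self._entries.get(token)
        if entry is None or entry[1] < time.monotonic():
            self._entries.pop(token, None)
            return None
        return entry[0]


def mask(text: str, vault: InMemoryVault) -> tuple:
    """Replace each detected email with a vault token; return (text, count)."""
    count = 0

    def replace(match: re.Match) -> str:
        nonlocal count
        count += 1
        return vault.put(match.group(0))

    return EMAIL_PATTERN.sub(replace, text), count


def unmask(text: str, vault: InMemoryVault) -> str:
    """Restore originals; expired or unknown tokens are left untouched."""

    def restore(match: re.Match) -> str:
        original = vault.get(match.group(0))
        return original if original is not None else match.group(0)

    return TOKEN_PATTERN.sub(restore, text)


def choose_route(masked_prompt: str, max_local_words: int = 20) -> str:
    """Placeholder heuristic: short prompts stay local, long ones go to Gemini."""
    return "ollama" if len(masked_prompt.split()) <= max_local_words else "gemini"
```

Only the masked prompt is passed to `choose_route` and onward to a model, so the original values never leave the vault until `unmask` runs on the response. A token whose TTL has lapsed simply stays in the text as a placeholder.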

API Contract

POST /v1/chat — requires X-API-Key header.

Request:

{
  "session_id": "unique-session-identifier",
  "message": "user prompt text"
}

Response (200):

{
  "reply": "unmasked AI response",
  "metadata": {
    "routed_to": "llama-3-local | gemini-3.1-flash-lite-preview",
    "pii_entities_masked": 2,
    "latency_ms": 1205
  }
}

Status  Meaning
401     Missing or invalid X-API-Key
429     Rate limit exceeded (>10 req/min/IP)
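
The two error statuses can be illustrated with a short standard-library sketch. This is not the code in app/security.py: the `SlidingWindowLimiter` class and `check_request` function are hypothetical names, the sliding-window algorithm is one of several ways to enforce a limit of 10 requests per minute per IP, and checking the key before the rate limit is an assumed ordering.

```python
import hmac
import time
from collections import defaultdict, deque
from typing import Optional

RATE_LIMIT = 10         # requests allowed per window, per client IP
WINDOW_SECONDS = 60.0   # length of the sliding window


class SlidingWindowLimiter:
    """Keeps recent request timestamps per IP and rejects any beyond the limit."""

    def __init__(self, limit: int = RATE_LIMIT, window: float = WINDOW_SECONDS):
        self.limit = limit
        self.window = window
        self._hits = defaultdict(deque)

    def allow(self, client_ip: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True


def check_request(provided_key: Optional[str], expected_key: str,
                  client_ip: str, limiter: SlidingWindowLimiter,
                  now: Optional[float] = None) -> int:
    """Return the HTTP status a request would receive: 200, 401, or 429."""
    # compare_digest runs in constant time, so response timing does not leak the key.
    if provided_key is None or not hmac.compare_digest(
        provided_key.encode("utf-8"), expected_key.encode("utf-8")
    ):
        return 401
    if not limiter.allow(client_ip, now):
        return 429
    return 200
```

In this sketch, the eleventh request from one IP inside a 60-second window gets 429, while other IPs are unaffected. Because rejected requests are not recorded, a client that keeps retrying is served again as soon as its oldest request ages out of the window.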

Project Structure

sentinel/
├── .env                    # API keys (not committed)
├── docker-compose.yml      # Zero-trust container orchestration
├── Dockerfile              # Gateway image build
├── requirements.txt        # Pinned Python dependencies
│
├── data/                   # Host volume for persistent storage
│   └── telemetry.db        # SQLite (auto-generated at runtime)
│
└── app/
    ├── config.py           # Pydantic BaseSettings (cached)
    ├── main.py             # FastAPI entry point & endpoint orchestration
    ├── schemas.py          # Pydantic request/response models
    ├── security.py         # API key verification & rate limiting
    ├── database.py         # SQLite init & telemetry logging
    ├── vault.py            # Redis + Presidio PII masking/unmasking
    └── router.py           # LangChain LLM routing (Ollama vs Gemini)
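
As an illustration of what database.py is described as doing (SQLite init and telemetry logging), here is a minimal sketch with the standard sqlite3 module. The table name, column set, and function names are assumptions chosen to mirror the response metadata fields; the real schema may differ, and the real gateway writes telemetry asynchronously, which this synchronous sketch does not attempt.

```python
import sqlite3
import time

# Assumption: hypothetical schema mirroring the /v1/chat response metadata.
SCHEMA = """
CREATE TABLE IF NOT EXISTS telemetry (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    created_at REAL NOT NULL,
    session_id TEXT NOT NULL,
    routed_to TEXT NOT NULL,
    pii_entities_masked INTEGER NOT NULL,
    latency_ms INTEGER NOT NULL
)
"""


def init_db(path: str) -> sqlite3.Connection:
    """Open (or create) the database file and ensure the table exists."""
    connection = sqlite3.connect(path)
    connection.execute(SCHEMA)
    connection.commit()
    return connection


def log_request(connection: sqlite3.Connection, session_id: str, routed_to: str,
                pii_entities_masked: int, latency_ms: int) -> None:
    """Insert one telemetry row; prompt and reply text are deliberately not stored."""
    connection.execute(
        "INSERT INTO telemetry "
        "(created_at, session_id, routed_to, pii_entities_masked, latency_ms) "
        "VALUES (?, ?, ?, ?, ?)",
        (time.time(), session_id, routed_to, pii_entities_masked, latency_ms),
    )
    connection.commit()
```

Passing `":memory:"` to `init_db` gives a throwaway database for experiments. Logging only metadata, never message text, keeps the telemetry store free of PII in this sketch.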
