Sentinel

A zero-trust AI gateway for real-time PII masking, dynamic LLM routing, and telemetry logging.

Sentinel is a secure, asynchronous Python gateway that intercepts data-sensitive AI prompts. It detects PII with Presidio, replaces it with tokens, and stores the original values in a Redis-backed token vault. The system then routes each query dynamically through LangChain and logs telemetry to a serverless SQLite database.

Motivation

Most enterprises (or even small groups) want to adopt AI assistants but face a hard constraint: sensitive data (names, emails, phone numbers) cannot be sent to third-party LLM providers. The typical answer is to either ban cloud LLMs entirely or trust the provider's data handling.

Sentinel sits between the user-facing chatbot (Microsoft Copilot Studio) and the LLMs, acting as a governance layer that strips user-sensitive data before any prompt leaves the network, vaults the original values in a TTL-scoped Redis store, and restores them in the response. Simple queries stay on a local Ollama instance that never touches the internet; only complex prompts that need a larger model are forwarded to Gemini, with PII already removed.
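
The mask, vault, and restore cycle described above can be sketched in a few lines. This is a minimal illustration and not the project's code: a single email regex stands in for Presidio, a dictionary stands in for Redis, and the names InMemoryVault, mask, unmask, and the `<PII_...>` token format are all hypothetical.

```python
import re
import time
import uuid
from typing import Optional

# Assumption: one regex that only finds emails, standing in for Presidio.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
# Assumption: tokens look like <PII_1a2b3c4d>; the real format is not documented.
TOKEN_RE = re.compile(r"<PII_[0-9a-f]{8}>")


class InMemoryVault:
    """Dict-backed stand-in for the TTL-scoped Redis store."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl_seconds = ttl_seconds
        self._store = {}  # "session:token" -> (original value, expiry time)

    def put(self, session_id: str, value: str) -> str:
        token = f"<PII_{uuid.uuid4().hex[:8]}>"
        expires_at = time.monotonic() + self.ttl_seconds
        self._store[f"{session_id}:{token}"] = (value, expires_at)
        return token

    def get(self, session_id: str, token: str) -> Optional[str]:
        entry = self._store.get(f"{session_id}:{token}")
        if entry is None or entry[1] < time.monotonic():
            return None  # unknown session/token pair, or entry expired
        return entry[0]


def mask(text: str, session_id: str, vault: InMemoryVault) -> str:
    """Replace each detected email with an opaque token and vault the original."""
    return EMAIL_RE.sub(lambda m: vault.put(session_id, m.group()), text)


def unmask(text: str, session_id: str, vault: InMemoryVault) -> str:
    """Restore vaulted values; unknown or expired tokens are left in place."""
    def restore(match):
        original = vault.get(session_id, match.group())
        return original if original is not None else match.group()
    return TOKEN_RE.sub(restore, text)
```

Scoping keys by session keeps one conversation from resolving another conversation's tokens, and an expired entry simply leaves the token visible in the reply instead of raising an error.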

Tech Stack

Frontend: Microsoft Copilot Studio (via Ngrok)
API Gateway: Python 3.11, FastAPI, Uvicorn, Pydantic
LLM Orchestration: LangChain (Ollama, Google Gemini)
PII Engine: Microsoft Presidio + spaCy (en_core_web_sm)
State Management: Redis (internal Docker network)
Telemetry: SQLite (host-mounted volume)

Getting Started

Prerequisites

Docker & Docker Compose
A Google Gemini API key
An Ngrok account (for Copilot Studio integration)

Setup

Clone the repository:

git clone https://github.com/<your-org>/sentinel.git
cd sentinel

Configure environment variables in a .env file at the repository root:

SENTINEL_API_KEY=<your-secure-random-key>
GEMINI_API_KEY=<your-gemini-api-key>

Launch services via Docker:
```
docker compose up --build -d
```

Pull a local model:

docker exec -it llm-local ollama pull llama3

Test the gateway:

curl -X POST http://localhost:8000/v1/chat \
  -H "Content-Type: application/json" \
  -H "X-API-Key: <your-sentinel-api-key>" \
  -d '{"session_id": "test-001", "message": "Hello, how are you?"}'
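
The same smoke test can be run from Python with only the standard library. The sketch below mirrors the curl call; the helper names build_chat_request and send_chat are illustrative and not part of the repository.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str,
                       session_id: str, message: str) -> urllib.request.Request:
    """Build the POST /v1/chat request without sending it."""
    body = json.dumps({"session_id": session_id, "message": message}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/chat",
        data=body,
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


def send_chat(request: urllib.request.Request, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the stack running, `send_chat(build_chat_request("http://localhost:8000", "<your-sentinel-api-key>", "test-001", "Hello, how are you?"))` should return the response dictionary; a 401 or 429 surfaces as `urllib.error.HTTPError`.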

Connecting Microsoft Copilot Studio

Copilot Studio can use Sentinel as its backend by calling the /v1/chat endpoint through an Ngrok HTTPS tunnel.

# Expose the local gateway to the public internet
ngrok http 8000

Note: Copy the secure forwarding URL generated by Ngrok (e.g., https://abc123.ngrok-free.app).

Navigate to your Microsoft Copilot Studio portal and create a new Custom Connector to link your agent to the Sentinel gateway.

Configure the security and routing settings as follows:

Base URL: <YOUR_NGROK_HTTPS_URL>
Authentication Type: API Key
API Key Parameter Name: X-API-Key
Parameter Location: Header
API Key Value: The value of your .env SENTINEL_API_KEY

Create a new POST action pointing to the /v1/chat endpoint. You must define the Request and Response schemas so Copilot Studio understands the API contract.


Request Payload Schema:

{
  "session_id": "string",
  "message": "string"
}

Response Payload Schema:

{
  "reply": "string",
  "metadata": {
    "routed_to": "string",
    "pii_entities_masked": 0,
    "latency_ms": 0
  }
}

Inside your Copilot Studio dialogue tree, add a node to "Call an action" and select your Sentinel connector. Map the Copilot system variables to the JSON request payload:

message: Map to the user's raw text input (the conversation turn).
session_id: Map to System.Conversation.Id.
Output: Map the API's reply response field to a standard Copilot chat bubble node.

The agent will now route all messages through Sentinel.

Architecture

         [ Client ]
             |
             |    POST /v1/chat (X-API-Key)
             v
+---------------------------------------+
|    1. Auth & Intercept (FastAPI)      |
+---------------------------------------+
             |
             |    (Internal Network)
             v
+---------------------------------------+
|    2. PII Masking (Presidio)          | <---> [ Redis Vault ] 
+---------------------------------------+      
             |
             v
+---------------------------------------+
|    3. Semantic Router (LangChain)     |
+---------------------------------------+
          /                   \
     (Simple)               (Complex)
        /                       \
 [ Local Ollama ]         [ Google Gemini ]
        \                       /
         +----------+----------+
                    |
                    v
+---------------------------------------+
|    4. PII Unmasking                   | <---> [ Redis Vault ]
+---------------------------------------+
                    |
                    | (Async)
                    +------------> [ SQLite Telemetry DB ]
                    |                   (Host Volume)
                    v
            [ JSON Response ]

redis-vault and llm-local are on an internal Docker bridge network with zero host port exposure.
Gemini is called over HTTPS from the gateway container — PII is already stripped before the request leaves.
data/telemetry.db lives on a host volume mount (./data:/data); it is not a network service.
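
The routing step (3) decides between the local model and Gemini. This README does not document the actual criteria, so the sketch below uses an invented heuristic (prompt length plus a few keyword hints) purely to illustrate the shape of such a decision. COMPLEX_HINTS, MAX_SIMPLE_WORDS, Route, and choose_route are all hypothetical.

```python
from dataclasses import dataclass

# Assumed values for illustration only; the project's real routing rules may differ.
COMPLEX_HINTS = ("analyze", "compare", "summarize", "explain why", "step by step")
MAX_SIMPLE_WORDS = 40


@dataclass(frozen=True)
class Route:
    target: str  # "local" for Ollama, "cloud" for Gemini
    reason: str


def choose_route(masked_prompt: str) -> Route:
    """Pick a backend for a prompt that has already been PII-masked."""
    if len(masked_prompt.split()) > MAX_SIMPLE_WORDS:
        return Route("cloud", "prompt longer than word limit")
    lowered = masked_prompt.lower()
    for hint in COMPLEX_HINTS:
        if hint in lowered:
            return Route("cloud", f"contains hint: {hint}")
    return Route("local", "short prompt without complexity hints")
```

Because the router only ever sees the masked prompt, a wrong routing decision costs latency or answer quality but does not send original PII to the cloud model.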

API Contract

POST /v1/chat — requires X-API-Key header.

Request:

{
  "session_id": "unique-session-identifier",
  "message": "user prompt text"
}

Response (200):

{
  "reply": "unmasked AI response",
  "metadata": {
    "routed_to": "llama-3-local | gemini-3.1-flash-lite-preview",
    "pii_entities_masked": 2,
    "latency_ms": 1205
  }
}
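
A client can parse this response into typed objects. The project lists Pydantic models in app/schemas.py; the sketch below uses standard-library dataclasses instead to stay dependency-free, and the class and function names are illustrative.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ChatMetadata:
    routed_to: str
    pii_entities_masked: int
    latency_ms: int


@dataclass
class ChatResponse:
    reply: str
    metadata: ChatMetadata


def parse_chat_response(raw: str) -> ChatResponse:
    """Parse a 200 response body; raises KeyError if a documented field is missing."""
    payload = json.loads(raw)
    meta = payload["metadata"]
    return ChatResponse(
        reply=str(payload["reply"]),
        metadata=ChatMetadata(
            routed_to=str(meta["routed_to"]),
            pii_entities_masked=int(meta["pii_entities_masked"]),
            latency_ms=int(meta["latency_ms"]),
        ),
    )
```

`asdict(response)` converts the parsed object back into a plain dictionary matching the JSON shape above.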

| Status | Meaning |
|--------|---------|
| `401` | Missing or invalid `X-API-Key` |
| `429` | Rate limit exceeded (>10 req/min/IP) |
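
The two error statuses can be illustrated with a small sketch. It assumes a sliding 60-second window per client IP and that the key check runs before the rate limit; the README states only the limit itself, so both choices are assumptions, as are the names SlidingWindowLimiter and check_request.

```python
import hmac
import time
from collections import defaultdict, deque
from typing import Optional


class SlidingWindowLimiter:
    """Allow at most max_requests per client within the trailing window."""

    def __init__(self, max_requests: int = 10, window_seconds: float = 60.0):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # client IP -> timestamps of accepted requests

    def allow(self, client_ip: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        hits = self._hits[client_ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


def check_request(api_key: Optional[str], expected_key: str, client_ip: str,
                  limiter: SlidingWindowLimiter, now: Optional[float] = None) -> int:
    """Return the HTTP status the gateway would answer with: 401, 429, or 200."""
    # Constant-time comparison avoids leaking key contents through timing.
    if not api_key or not hmac.compare_digest(api_key.encode(), expected_key.encode()):
        return 401
    if not limiter.allow(client_ip, now):
        return 429
    return 200
```

Rejected requests are not recorded, so a client that keeps retrying while limited does not extend its own lockout.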

Project Structure

sentinel/
├── .env                    # API keys (not committed)
├── docker-compose.yml      # Zero-trust container orchestration
├── Dockerfile              # Gateway image build
├── requirements.txt        # Pinned Python dependencies
│
├── data/                   # Host volume for persistent storage
│   └── telemetry.db        # SQLite (auto-generated at runtime)
│
└── app/
    ├── config.py           # Pydantic BaseSettings (cached)
    ├── main.py             # FastAPI entry point & endpoint orchestration
    ├── schemas.py          # Pydantic request/response models
    ├── security.py         # API key verification & rate limiting
    ├── database.py         # SQLite init & telemetry logging
    ├── vault.py            # Redis + Presidio PII masking/unmasking
    └── router.py           # LangChain LLM routing (Ollama vs Gemini)
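
As an illustration of what app/database.py is responsible for, the sketch below creates a telemetry table and logs one row per request using the standard sqlite3 module. The table name, columns, and function names are assumptions modeled on the response metadata; the real schema is not shown in this README, and the gateway is described as logging asynchronously, whereas this sketch is synchronous for brevity.

```python
import sqlite3

# Assumed schema: one row per request, metadata only (no prompt text).
SCHEMA = """
CREATE TABLE IF NOT EXISTS telemetry (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    session_id TEXT NOT NULL,
    routed_to TEXT NOT NULL,
    pii_entities_masked INTEGER NOT NULL,
    latency_ms INTEGER NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)
"""


def init_db(path: str) -> sqlite3.Connection:
    """Open the database file (or ":memory:") and create the table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    conn.commit()
    return conn


def log_request(conn: sqlite3.Connection, session_id: str, routed_to: str,
                pii_entities_masked: int, latency_ms: int) -> None:
    """Insert one telemetry row using bound parameters."""
    conn.execute(
        "INSERT INTO telemetry (session_id, routed_to, pii_entities_masked, latency_ms) "
        "VALUES (?, ?, ?, ?)",
        (session_id, routed_to, pii_entities_masked, latency_ms),
    )
    conn.commit()
```

Inside the container, `init_db("/data/telemetry.db")` would write to the host-mounted volume described in the Architecture notes.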
