diff --git a/README.md b/README.md index 2c6e1a1..0ddfb0c 100644 --- a/README.md +++ b/README.md @@ -10,17 +10,22 @@ ## What it does -Egressor is a local HTTPS proxy that intercepts outbound traffic from developer tools (Claude Code, Kiro, Cursor, etc.), showing you exactly what data — including which files — are being sent to LLM APIs. +Egressor sits between your AI coding tools and the LLM APIs they talk to. It intercepts every HTTPS request, shows you exactly what's being sent — including which files — and lets you block anything that shouldn't leave your machine. -- **TLS interception** — decrypts and inspects HTTPS payloads via a local CA +It's built for developers who use tools like Claude Code, Kiro, or Cursor and want to know (and control) what data those tools send to external APIs. + +- **TLS interception** — decrypts and inspects HTTPS payloads using a local CA - **File detection** — identifies file paths and contents in API request bodies -- **Directory scope enforcement** — blocks requests referencing files outside allowed project directories -- **File blocking** — prevents sensitive files (`.env`, `.pem`, secrets) from being sent +- **Directory scope** — restricts which directories tools can access +- **File pattern blocking** — hard-blocks sensitive files like `.env`, `.pem`, and secrets +- **Content tags** — developers mark files with `// NO_LLM` to prevent them from being sent +- **Content keywords** — flags requests containing words like "CONFIDENTIAL" and asks what to do - **Desktop UI** — real-time session inspector with request/response viewer +- **System tray** — menu bar icon on macOS with status and quick controls - **Audit logging** — structured JSON logs with automatic rotation ``` -Developer Tool ──HTTPS──► Egressor ──HTTPS──► LLM API +Developer Tool ──HTTPS──> Egressor ──HTTPS──> LLM API (inspect) (detect files) (block if denied) @@ -34,12 +39,16 @@ Developer Tool ──HTTPS──► Egressor ──HTTPS──► LLM API ### Install -**Homebrew:** +**Homebrew (macOS):** ```bash brew tap ehsaniara/tap brew install egressor ``` +**GitHub Releases (all platforms):** + +Download the latest binary from [Releases](https://github.com/ehsaniara/egressor/releases), extract, and run. + **From source (macOS):** ```bash git clone https://github.com/ehsaniara/egressor.git @@ -48,7 +57,7 @@ cd internal/ui/frontend && npm install && npm run build && cd ../../.. CGO_LDFLAGS="-framework UniformTypeIdentifiers" go build -tags production -o egressor ./cmd/egressor ``` -**From source (Windows, requires Go 1.24+ and Node.js 22+):** +**From source (Windows — requires Go 1.24+ and Node.js 22+):** ```powershell git clone https://github.com/ehsaniara/egressor.git @@ -57,7 +66,7 @@ cd internal\ui\frontend; npm install; npm run build; cd ..\..\.. go build -tags production -o egressor.exe ./cmd/egressor ``` -**From source (Linux, headless):** +**From source (Linux — headless only):** ```bash git clone https://github.com/ehsaniara/egressor.git @@ -65,15 +74,15 @@ cd egressor go build -o egressor ./cmd/egressor ``` -### Setup +### First run -On first run, Egressor auto-generates a CA certificate and prints trust instructions: +Egressor auto-generates a CA certificate the first time you run it and prints setup instructions: ```bash ./egressor ``` -Trust the CA (required for TLS interception): +You'll need to trust the CA so Egressor can intercept TLS traffic. This is a one-time step. **macOS:** ```bash @@ -81,13 +90,6 @@ sudo security add-trusted-cert -d -r trustRoot \ -k /Library/Keychains/System.keychain ~/.egressor/ca.pem ``` -**Windows (PowerShell as Administrator):** - -```powershell -Import-Certificate -FilePath "$env:USERPROFILE\.egressor\ca.pem" ` - -CertStoreLocation Cert:\LocalMachine\Root -``` - **Linux:** ```bash @@ -95,123 +97,176 @@ sudo cp ~/.egressor/ca.pem /usr/local/share/ca-certificates/egressor.crt sudo update-ca-certificates ``` -### Configure your tools +### Configure your tools (macOS / Linux) -Set the proxy and CA certificate for your LLM tools: +Tell your LLM tools to route traffic through Egressor and trust its CA: -**macOS / Linux:** ```bash export NODE_EXTRA_CA_CERTS=~/.egressor/ca.pem export HTTPS_PROXY=http://127.0.0.1:8080 ``` -**Windows (PowerShell):** +Then launch your tool. All HTTPS traffic now flows through Egressor. + +### Windows step-by-step + +**1. Download and extract** Egressor from [GitHub Releases](https://github.com/ehsaniara/egressor/releases), or build from source. + +**2. Run Egressor once** to auto-generate the CA certificate: +```powershell +.\egressor.exe +``` + +**3. Trust the CA** (run PowerShell as Administrator): +```powershell +Import-Certificate -FilePath "$env:USERPROFILE\.egressor\ca.pem" ` + -CertStoreLocation Cert:\LocalMachine\Root +``` +**4. Set environment variables** so your LLM tools route through Egressor: ```powershell $env:NODE_EXTRA_CA_CERTS = "$env:USERPROFILE\.egressor\ca.pem" $env:HTTPS_PROXY = "http://127.0.0.1:8080" ``` -Then launch your tool — all HTTPS traffic flows through Egressor. +To make these persist across sessions: +```powershell +[Environment]::SetEnvironmentVariable("NODE_EXTRA_CA_CERTS", "$env:USERPROFILE\.egressor\ca.pem", "User") +[Environment]::SetEnvironmentVariable("HTTPS_PROXY", "http://127.0.0.1:8080", "User") +``` + +**5. Start Egressor:** +```powershell +.\egressor.exe +``` + +**6. Launch your LLM tool** (Claude Code, Kiro, Cursor, etc.). All traffic to LLM APIs now flows through Egressor. + +To stop intercepting, close Egressor and remove the proxy variable: +```powershell +[Environment]::SetEnvironmentVariable("HTTPS_PROXY", $null, "User") +``` --- ## Usage ```bash -# Desktop UI (default) -egressor +egressor # desktop UI (default) +egressor --headless # terminal only, no window +egressor --config /path/to/config.yaml # custom config file +egressor --generate-ca # generate CA certificate and exit +egressor --version # print version +``` -# Headless mode (terminal only) -egressor --headless +Egressor looks for its config file in this order: -# Custom config -egressor --config /path/to/config.yaml +1. `--config` flag (if provided) +2. `./config.yaml` (current directory) +3. `~/.egressor/config.yaml` (created automatically on first run) -# Generate CA manually -egressor --generate-ca +--- -# Print version -egressor --version -``` +## How blocking works -### Config file resolution +Egressor checks every outbound request through four layers, in order. Each layer serves a different purpose: -1. `--config` flag (explicit override) -2. `./config.yaml` (current directory) -3. `~/.egressor/config.yaml` (home directory) +### 1. Directory scope (`allowed_directories`) ---- +Restricts which parts of the filesystem tools can access. If you set this to your project directory, any file reference outside that directory is blocked immediately. This catches tools trying to read `~/.ssh/id_rsa`, `/etc/passwd`, or files from other projects. + +```yaml +allowed_directories: + - "~/Projects/my-app" +``` + +Leave empty to allow all directories (default). + +### 2. File patterns (`deny_file_patterns`) -## Configuration +Blocks requests that reference files matching glob patterns. This catches sensitive files regardless of where they are — even inside your allowed directories. ```yaml -listen_address: "127.0.0.1:8080" - -policy: - # Block requests that reference files outside these directories. - # If empty, no directory scope is enforced. - allowed_directories: - # - "~/Projects/my-app" - - deny_file_patterns: - - "*.env" - - "*.pem" - - "*.key" - - "**/secrets/**" - - "**/credentials*" - - ".aws/*" - - # Keywords that trigger interactive approval - deny_content_keywords: - - "CONFIDENTIAL" - - "INTERNAL ONLY" - -logging: - format: json - file: ~/.egressor/logs/audit.log - max_size_mb: 2 +deny_file_patterns: + - "*.env" # environment files + - "*.pem" # certificates + - "*.key" # private keys + - "**/secrets/**" # anything under a secrets/ directory + - "**/credentials*" # credential files + - ".aws/*" # AWS config +``` -intercept: - ca_cert: ~/.egressor/ca.pem - ca_key: ~/.egressor/ca-key.pem - log_body: true - max_body_size: 1048576 # 1MB +| Pattern | What it matches | +|----------------------|---------------------------------------| +| `*.env` | `.env`, `config/.env` | +| `*.pem` | `ca.pem`, `path/to/cert.pem` | +| `**/secrets/**` | `config/secrets/db.yaml` | +| `**/credentials*` | `home/credentials.json` | +| `.aws/*` | `.aws/config`, `.aws/credentials` | + +### 3. Content tags (`deny_content_tags`) — hard block + +Developers can mark individual files to prevent them from being sent to LLMs by adding a tag as a comment: + +```go +// NO_LLM +package internal +``` + +```python +# NO_LLM +class TradeSecret: +``` + +```yaml +# NO_LLM +api_keys: + production: sk-... ``` -### Allowed directories +When Egressor detects a tag in the request body, it blocks immediately. No prompt, no whitelist bypass. The developer said no. -When `allowed_directories` is set, any request containing file references outside those directories is blocked. This prevents LLM tools from accessing files beyond the intended project scope (e.g. `~/.ssh`, `/etc/passwd`, `~/.aws/credentials`). +```yaml +deny_content_tags: + - "NO_LLM" +``` -- Relative paths are resolved against the current working directory -- Path traversals (`../`) are cleaned before evaluation -- Multiple directories can be specified -- Leave empty to allow all directories (default) +### 4. Content keywords (`deny_content_keywords`) — interactive -### Deny file patterns +For content that might be sensitive but needs human judgment. When a keyword is found, Egressor pauses the request and shows a prompt in the desktop UI: -Glob patterns that block requests containing matching file references: +- **Allow Once** — send this time, don't remember +- **Allow Always** — send and add the file to `content_whitelist` (won't ask again) +- **Block Once** — reject this time, don't remember +- **Block Always** — reject and add the file to `content_blacklist` (auto-block next time) -| Pattern | Matches | -|---------|---------| -| `*.env` | `.env`, `config/.env` | -| `*.pem` | `ca.pem`, `path/to/cert.pem` | -| `**/secrets/**` | `config/secrets/db.yaml` | -| `**/credentials*` | `home/credentials.json` | -| `.aws/*` | `.aws/config`, `.aws/credentials` | +```yaml +deny_content_keywords: + - "CONFIDENTIAL" + - "INTERNAL ONLY" -When a request body contains a file matching a deny pattern, Egressor returns `403` to the client and logs the blocked request — the payload never reaches the LLM. +# Auto-managed by the UI when users choose "Allow Always" or "Block Always" +content_whitelist: [] +content_blacklist: [] +``` -### Content keywords (interactive approval) +In headless mode (no UI), keyword matches are blocked by default. -When `deny_content_keywords` is set, request bodies are scanned for these keywords (case-insensitive). If a match is found, Egressor pauses the request and prompts the user in the desktop UI with four options: +### Content type filtering -- **Allow Once** — forward this request, don't remember -- **Allow Always** — forward and add the file to the whitelist (never ask again) -- **Block Once** — return 403, don't remember -- **Block Always** — return 403 and add the file to the blacklist (auto-block in future) +Binary and encoded content is skipped during scanning to avoid false positives. You can customize which content types to skip: -The whitelist and blacklist are persisted to `config.yaml` via "Save to config". In headless mode, keyword matches are blocked by default (no UI to prompt). +```yaml +intercept: + skip_content_types: + - "image/*" + - "audio/*" + - "video/*" + - "application/octet-stream" + - "application/zip" + - "application/gzip" + - "application/pdf" +``` --- @@ -219,12 +274,13 @@ The whitelist and blacklist are persisted to `config.yaml` via "Save to config". The default mode opens a native desktop window (built with Wails + React): -- **Sessions tab** — live table of intercepted connections with method, host, status, file count -- **Detail panel** — click a session to see full request/response headers, body (JSON-formatted), and detected files -- **Policy tab** — manage allowed directories, deny file patterns, content keywords, whitelist/blacklist +- **Sessions tab** — live table of intercepted connections showing method, host, status, and file count +- **Detail panel** — click a session to inspect full request/response headers, body (formatted JSON), and detected files +- **Policy tab** — manage all policy rules: allowed directories, deny patterns, content tags, content keywords, whitelist, and blacklist - **Bottom bar** — proxy start/stop, pause/resume policy, session stats +- **System tray** (macOS) — menu bar icon with status, pause/resume, and quit -Blocked requests are highlighted in red with the matching deny pattern shown. +Blocked requests show up in red with the reason displayed. --- @@ -233,27 +289,28 @@ Blocked requests are highlighted in red with the matching deny pattern shown. Egressor performs a TLS man-in-the-middle on every HTTPS connection: ``` -Client ──TLS(egressor cert)──► Egressor ──TLS(real cert)──► Server +Client ──TLS(egressor cert)──> Egressor ──TLS(real cert)──> Server ``` 1. Client sends `CONNECT api.anthropic.com:443` 2. Egressor opens a TCP connection to the real server -3. Egressor presents a dynamically generated certificate to the client (signed by its CA) +3. Egressor presents a dynamically generated certificate to the client (signed by its local CA) 4. Egressor opens its own TLS connection to the real server -5. Sitting between two decrypted streams, it reads the plaintext HTTP request +5. Between two decrypted streams, it reads the plaintext HTTP request 6. Extracts file references from the JSON payload -7. Checks file paths against `allowed_directories` — blocks if out of scope -8. Checks file paths against `deny_file_patterns` — blocks if matched -9. Scans body for `deny_content_keywords` — prompts user if matched (whitelist/blacklist checked first) -10. If blocked: returns `403`, logs the attempt, never forwards to the server -11. If allowed: forwards the request, captures the response, logs everything +7. Checks `allowed_directories` — blocks if any file is out of scope +8. Checks `deny_file_patterns` — blocks if any file path matches +9. Checks `deny_content_tags` — blocks if body contains a tag like `NO_LLM` +10. Checks `deny_content_keywords` — prompts user if body contains a keyword (checks whitelist/blacklist first) +11. If blocked: returns 403, logs the attempt, never forwards to the server +12. If allowed: forwards the request, captures the response, logs everything ### File detection -Egressor scans request bodies for file references using: +Egressor scans request bodies for file references using multiple strategies: - **JSON field keys**: `path`, `file_path`, `filePath`, `filename`, `source`, `uri` -- **JSON string values** that look like file paths +- **JSON string values** that look like file paths (contain `/` and a file extension) - **Markdown code fences**: `` ```go:cmd/main.go `` - **XML-style tags**: ``, `lib/auth.rb` - **Text patterns**: `File: src/main.py`, `from src/handler.ts` @@ -262,7 +319,7 @@ Egressor scans request bodies for file references using: ## Audit Logs -Sessions are logged as newline-delimited JSON to `~/.egressor/logs/audit.log` with 2MB rotation: +Every intercepted session is logged as newline-delimited JSON to `~/.egressor/logs/audit.log`. The log file is rotated automatically when it exceeds `max_size_mb` (default 2MB). ```json { @@ -294,9 +351,11 @@ Sessions are logged as newline-delimited JSON to `~/.egressor/logs/audit.log` wi cmd/egressor/main.go Entry point, config resolution, mode selection internal/ proxy/ - proxy.go TCP listener, CONNECT handler, lifecycle control - intercept.go TLS MITM, HTTP relay, file extraction, policy check - policy/policy.go File pattern matching engine + proxy.go TCP listener, CONNECT handler, lifecycle + intercept.go TLS MITM, HTTP relay, policy enforcement + policy/ + policy.go Scope, patterns, tags, keywords engine + prompt.go Interactive prompt types and resolver audit/ session.go Session and exchange data models logger.go JSON logger with size-based rotation @@ -304,13 +363,18 @@ internal/ observer.go SessionSink interface, MultiSink fan-out ca/ ca.go CA generation and loading - cert.go Leaf certificate cache (LRU) - extract/files.go File reference extraction from payloads - config/config.go YAML config with defaults and ~ expansion + cert.go Leaf certificate LRU cache + extract/ + files.go File reference extraction from payloads + config/ + config.go YAML config with defaults and ~ expansion + tray/ + tray.go macOS system tray (menu bar icon) ui/ - app.go Wails-bound app (session queries, policy, proxy control) - ui.go Wails window runner + app.go Wails-bound app (sessions, policy, prompts) + ui.go Wails window runner + embedded assets frontend/ React + TypeScript + Tailwind CSS +config.yaml Default configuration (well-commented) ``` --- @@ -329,10 +393,10 @@ internal/ # Build frontend cd internal/ui/frontend && npm install && npm run build && cd ../../.. -# Build binary (macOS with Wails UI) +# Build binary (macOS with desktop UI) CGO_LDFLAGS="-framework UniformTypeIdentifiers" go build -tags production -o egressor ./cmd/egressor -# Build headless only (no CGO required) +# Build headless only (any platform, no CGO) go build -o egressor ./cmd/egressor ``` diff --git a/cmd/egressor/main.go b/cmd/egressor/main.go index e0857dc..acd4095 100644 --- a/cmd/egressor/main.go +++ b/cmd/egressor/main.go @@ -82,7 +82,7 @@ func main() { fmt.Println() } engine := policy.NewEngine(cfg.Policy) - interceptor := proxy.NewInterceptor(authority, cfg.Intercept.LogBody, cfg.Intercept.MaxBodySize, engine) + interceptor := proxy.NewInterceptor(authority, cfg.Intercept.LogBody, cfg.Intercept.MaxBodySize, engine, cfg.Intercept.SkipContentTypes) slog.Info("TLS interception enabled") if *headless { @@ -111,39 +111,6 @@ func fileExists(path string) bool { return err == nil } -const defaultConfig = `listen_address: "127.0.0.1:8080" - -policy: - # Block requests that reference files outside these directories. - # If empty, no directory scope is enforced. - allowed_directories: - # - "~/Projects/my-app" - - deny_file_patterns: - - "*.env" - - "*.pem" - - "*.key" - - "**/secrets/**" - - "**/credentials*" - - ".aws/*" - - # Keywords that trigger interactive approval before sending to LLM. - # deny_content_keywords: - # - "CONFIDENTIAL" - # - "INTERNAL ONLY" - -logging: - format: json - file: ~/.egressor/logs/audit.log - max_size_mb: 2 - -intercept: - ca_cert: ~/.egressor/ca.pem - ca_key: ~/.egressor/ca-key.pem - log_body: true - max_body_size: 1048576 -` - func resolveConfigPath(explicit string) string { if explicit != "" { return explicit @@ -158,15 +125,22 @@ func resolveConfigPath(explicit string) string { if _, err := os.Stat(p); err == nil { return p } - // 3. Auto-create default config at ~/.egressor/config.yaml - if err := os.MkdirAll(filepath.Dir(p), 0o755); err == nil { - if err := os.WriteFile(p, []byte(defaultConfig), 0o644); err == nil { - fmt.Printf(" Default config created at %s\n\n", p) - return p - } - } } - return "config.yaml" + // No config found — tell the user how to set one up + home, _ := os.UserHomeDir() + defaultPath := filepath.Join(home, ".egressor", "config.yaml") + fmt.Println("No config file found. Egressor looked in:") + fmt.Println(" 1. ./config.yaml") + fmt.Printf(" 2. %s\n", defaultPath) + fmt.Println() + fmt.Println("To get started, create a config file:") + fmt.Printf(" mkdir -p %s\n", filepath.Dir(defaultPath)) + fmt.Printf(" cp config.yaml %s\n", defaultPath) + fmt.Println() + fmt.Println("Or specify a path directly:") + fmt.Println(" egressor --config /path/to/config.yaml") + os.Exit(1) + return "" } func runHeadless(server *proxy.Server, cfg *config.Config) { diff --git a/config.yaml b/config.yaml index fa60c5f..5f15a4a 100644 --- a/config.yaml +++ b/config.yaml @@ -1,31 +1,105 @@ +# Egressor Configuration +# See https://github.com/ehsaniara/egressor for full documentation. + +# Address the proxy listens on. Only binds to localhost for security. listen_address: "127.0.0.1:8080" policy: - # Block requests that reference files outside these directories. - # If empty, no directory scope is enforced. + + # --- Scope --- + # Restrict which directories LLM tools can access. + # Any file reference outside these directories will be blocked immediately. + # This prevents tools from reading ~/.ssh, /etc/passwd, or other projects. + # Leave empty to allow all directories (default). allowed_directories: # - "~/Projects/my-app" + # --- File patterns (hard block) --- + # Glob patterns that match sensitive file paths. + # If a request references a file matching any of these, it's blocked + # with a 403 — no prompt, no exceptions. + # Supports: * (any chars), ** (any depth), ? (single char) deny_file_patterns: - - "*.env" - - "*.pem" - - "*.key" - - "**/secrets/**" - - "**/credentials*" - - ".aws/*" - - # Keywords that trigger interactive approval before sending to LLM. + - "*.env" # environment files (.env, .env.local) + - "*.pem" # certificates + - "*.key" # private keys + - "**/secrets/**" # any file under a secrets/ directory + - "**/credentials*" # credential files (credentials.json, etc.) + - ".aws/*" # AWS config and credentials + + # --- File tags (hard block) --- + # Developers can add these tags as comments at the top of any file + # to prevent it from being sent to LLMs. For example: + # + # // NO_LLM (Go, JS, Java, C) + # # NO_LLM (Python, Ruby, YAML, Shell) + # (HTML, XML) + # + # When Egressor detects a tag in the request body, it blocks immediately. + # No prompt, no whitelist bypass — the developer said no. + deny_content_tags: + - "NO_LLM" + + # --- Content keywords (interactive) --- + # When a request body contains any of these keywords, Egressor pauses + # the request and shows a prompt in the desktop UI asking what to do. + # Unlike tags above, keywords give the user a choice: + # + # Allow Once — send this time, don't remember + # Allow Always — send and add the file to content_whitelist + # Block Once — reject this time, don't remember + # Block Always — reject and add the file to content_blacklist + # + # In headless mode (no UI), keyword matches are blocked by default. deny_content_keywords: - "CONFIDENTIAL" - "INTERNAL ONLY" + # --- Whitelist / Blacklist (auto-managed) --- + # These lists are populated automatically when users choose + # "Allow Always" or "Block Always" from the keyword prompt. + # You can also edit them by hand if needed. + # + # Whitelisted files skip keyword prompts — they're always allowed through. + # Blacklisted files are blocked automatically when a keyword matches. + content_whitelist: [] + content_blacklist: [] + logging: + # Log format. Currently only "json" is supported. format: json + + # Where audit logs are written. Each intercepted session is logged as + # a single JSON line. Rotated automatically when the file exceeds max_size_mb. file: ~/.egressor/logs/audit.log + + # Maximum log file size in MB before rotation. Rotated files are kept + # with a unix timestamp suffix (e.g. audit.log.1712345678). max_size_mb: 2 intercept: + # Path to the CA certificate and key used for TLS interception. + # Auto-generated on first run if they don't exist. ca_cert: ~/.egressor/ca.pem ca_key: ~/.egressor/ca-key.pem + + # Whether to capture request/response bodies in audit logs. + # Useful for debugging, but logs may contain sensitive data. log_body: true - max_body_size: 1048576 # 1MB \ No newline at end of file + + # Maximum body size to capture in logs (in bytes). Bodies larger than + # this are truncated in logs but still forwarded to the upstream server. + max_body_size: 1048576 # 1MB + + # Content types to skip during scanning. Egressor only scans text-based + # payloads — it has no OCR or binary decoding, so images, archives, and + # other non-text formats are skipped entirely to avoid false positives. + # Supports wildcards (e.g. "image/*" matches image/png, image/jpeg, etc.) + skip_content_types: + - "image/*" + - "audio/*" + - "video/*" + - "application/octet-stream" + - "application/zip" + - "application/gzip" + - "application/pdf" \ No newline at end of file diff --git a/docs/components.md b/docs/components.md new file mode 100644 index 0000000..19be0ef --- /dev/null +++ b/docs/components.md @@ -0,0 +1,137 @@ +# Components + +A closer look at how each part of Egressor works. + +## Proxy Server (`internal/proxy/proxy.go`) + +The front door for all traffic. Binds to `127.0.0.1:8080` (configurable) and listens for HTTP CONNECT requests. When a client connects, the proxy: + +1. Parses the target host and port from the CONNECT request +2. Dials a TCP connection to the real upstream server (5-second timeout) +3. Responds `200 Connection Established` to the client +4. Hands both connections to the TLS Interceptor + +The server supports `Start()`, `Stop()`, and `IsRunning()` so the desktop UI can control it. In headless mode, it runs until interrupted with Ctrl+C. + +## TLS Interceptor (`internal/proxy/intercept.go`) + +The core of Egressor. Every HTTPS connection is intercepted -- there's no pass-through mode. + +For each connection, the interceptor: + +1. **TLS-terminates the client side** using a dynamically generated certificate signed by the local CA. The client thinks it's talking to the real server. +2. **TLS-connects to the real upstream server** using the server's actual certificate. The server thinks it's talking directly to the client. +3. **Relays HTTP requests** in a loop between the two decrypted streams. + +For each request in the relay loop: +- The full body is buffered into memory before forwarding (capped at `max_body_size`) +- If the `Content-Type` matches `skip_content_types`, scanning is skipped +- Otherwise, file references are extracted and the four policy layers are checked +- If blocked: a 403 response is sent back over the TLS connection and the exchange is logged +- If allowed: the request is forwarded, the response is captured and relayed back + +The interceptor holds a `PromptResolver` interface. In desktop mode, this is wired to the `App` struct which emits Wails events and blocks on a Go channel. In headless mode, it's a `HeadlessResolver` that blocks everything. + +Key design decision: the body is fully buffered before forwarding. This adds latency, but it's the only way to inspect the payload before it leaves the machine. For typical LLM API requests (JSON payloads under 1MB), this isn't noticeable. + +## File Extraction (`internal/extract/files.go`) + +Scans request bodies for file references. LLM API payloads (from tools like Claude Code and Cursor) embed file contents in JSON. Egressor needs to figure out which files are being sent, even though every tool formats its payloads differently. + +The extraction uses two approaches: + +**JSON field scanning:** Parses the body as JSON and walks the tree looking for: +- Keys that typically hold file paths: `path`, `file_path`, `filePath`, `filename`, `source`, `uri` +- String values that look like file paths (contain `/` or `\` and have a file extension) +- Filters out URLs (starting with `http://` or `https://`) + +**Text pattern matching:** For file references embedded in longer text content (like code blocks or markdown), uses regex patterns: +- Markdown code fences: `` ```go:cmd/main.go `` +- XML-style tags: ``, `lib/auth.rb` +- Inline references: `File: src/main.py`, `from src/handler.ts` + +Results are deduplicated. Each detected file gets a `source` label (`"json_field"` or `"text_pattern"`) for the audit log. + +## Policy Engine (`internal/policy/policy.go`, `prompt.go`) + +See [policy.md](policy.md) for the full breakdown of the four layers. + +The engine is a single `Engine` struct that holds all policy state and provides thread-safe evaluation methods. All state is protected by a `sync.RWMutex` -- reads can happen concurrently, writes are exclusive. + +Each policy layer has its own evaluation method that returns a `Decision` (allowed/denied with a reason). The interceptor calls them in sequence and stops at the first denial. + +The engine also supports a bypass toggle (`atomic.Bool`). When bypassed, all evaluations return "allowed" without checking anything. + +Runtime mutations (adding/removing patterns, directories, keywords, whitelist/blacklist entries) take effect immediately without restart. The UI calls these methods via Wails bindings. + +## Certificate Authority (`internal/ca/`) + +**`ca.go`** -- Generates and loads the root CA certificate. + +On first run, Egressor creates a self-signed ECDSA P-256 root CA with 10-year validity. The certificate and key are written to `~/.egressor/ca.pem` and `~/.egressor/ca-key.pem`. The key is stored with `0600` permissions (owner read/write only). + +On subsequent runs, the existing CA is loaded from disk. + +**`cert.go`** -- Dynamic leaf certificate cache. + +When a client connects to `api.anthropic.com:443`, the interceptor needs a certificate for that hostname. The `CertCache` generates one on the fly -- signed by the root CA -- and caches it in an LRU cache (1024 entries, 24-hour validity). + +Implements Go's `tls.Config.GetCertificate` interface so it plugs directly into the TLS server config. Supports both DNS names and IP address SANs. + +## Audit System (`internal/audit/`) + +**`session.go`** -- Data models for `Session` (a single proxied connection) and `InterceptedExchange` (a request/response pair within a session). Sessions track timing, target host, dial status, and a list of exchanges. Each exchange records the method, URL, headers, body, detected files, and whether it was blocked. + +**`logger.go`** -- Writes complete sessions as newline-delimited JSON to `~/.egressor/logs/audit.log`. When the file exceeds `max_size_mb`, it's rotated by renaming to `audit.log.`. Rotated files accumulate -- there's no automatic cleanup. + +**`store.go`** -- An in-memory ring buffer that holds the last 1000 sessions for the desktop UI. Supports observer callbacks: when a new session is added, all registered observers are notified. The UI uses this to push real-time updates to the frontend. + +**`observer.go`** -- `SessionSink` interface (`Log(*Session)`) with a `MultiSink` implementation that fans out to multiple sinks. In practice, this means every session goes to both the file logger and the in-memory store. The proxy doesn't need to know about either -- it just calls `sink.Log(session)`. + +## Desktop UI (`internal/ui/`, `internal/tray/`) + +### Go layer + +**`app.go`** -- The Wails-bound application struct. Every public method is callable from JavaScript. Handles: +- Session queries: `GetRecentSessions()`, `GetSession()`, `GetStats()` +- Policy management: CRUD methods for all rule types (patterns, directories, tags, keywords, whitelist, blacklist) +- Content keyword prompts: implements `PromptResolver` by emitting a Wails event and blocking on a Go channel with a 30-second timeout +- Config persistence: `SaveConfig()` writes the current policy state back to the YAML file + +**`ui.go`** -- Configures and starts the Wails window. Frontend assets (the compiled React app) are embedded in the binary via `//go:embed`. + +### System tray (`internal/tray/`) + +**`tray.go`** (macOS only) -- Adds an icon to the macOS menu bar using `energye/systray`. The menu shows: +- Status: Running / Paused +- Pause / Resume toggle (syncs with the policy bypass toggle) +- Quit (stops the proxy and closes the app) + +**`tray_stub.go`** -- No-op implementation for non-macOS platforms. The `Available()` function returns false so the app knows not to try. + +### React frontend (`internal/ui/frontend/`) + +Built with React 19, TypeScript, Tailwind CSS, and Vite. Communicates with the Go backend through Wails-generated bindings. + +**Main views:** +- **Sessions tab** -- A live table of intercepted connections. Shows method, host, path, status, file count, and duration. Updates in real-time via the `session:new` event. +- **Detail panel** -- Click a session to see the full request and response side by side. Shows headers, body (formatted JSON), detected files, and block reasons. +- **Policy tab** -- Manage all policy rules in one place. Each rule type has its own section with add/remove controls. +- **Bottom bar** -- Proxy start/stop, policy pause/resume, and live stats (total sessions, blocked, files detected). + +**Interactive prompt:** +- `ContentPromptModal.tsx` -- A full-screen modal that appears when a content keyword match needs user input. Shows the matched keyword, target URL, and affected files. Four action buttons: Allow Once, Allow Always, Block Once, Block Always. Has a 30-second countdown timer -- if the user doesn't respond, the request is blocked. +- `useContentPrompts.ts` -- React hook that queues incoming prompts and handles resolution. + +## Configuration (`internal/config/config.go`) + +YAML format with sensible defaults for everything. Supports `~` expansion for file paths. The `Load()` function applies defaults for missing values (like `max_body_size` and `skip_content_types`). + +The `Save()` function writes the config back to disk -- used by the UI's "Save to config" button to persist policy changes. + +Config resolution order: +1. `--config` flag (explicit path) +2. `./config.yaml` (current directory) +3. `~/.egressor/config.yaml` (home directory -- auto-created with defaults on first run) + +See `config.yaml` in the project root for a fully commented example of every option. \ No newline at end of file diff --git a/docs/data-flow.md b/docs/data-flow.md new file mode 100644 index 0000000..46254e2 --- /dev/null +++ b/docs/data-flow.md @@ -0,0 +1,117 @@ +# Data Flow + +Step-by-step examples of how requests flow through Egressor in different scenarios. + +## Allowed request + +Everything checks out -- the request is forwarded to the LLM API. + +``` +1. Client sends CONNECT api.anthropic.com:443 +2. Proxy dials TCP to api.anthropic.com:443 +3. Proxy responds 200 Connection Established +4. Interceptor does TLS handshake with client (using a dynamic cert) +5. Interceptor does TLS handshake with upstream (using the real cert) +6. Interceptor reads the HTTP request and buffers the body +7. Content-Type is application/json -- not skipped +8. File extraction finds ["src/main.go"] in the JSON payload +9. EvaluateScope: src/main.go is inside ~/Projects/my-app -- passes +10. EvaluateFiles: no deny pattern matches src/main.go -- passes +11. EvaluateContentTags: body doesn't contain NO_LLM -- passes +12. EvaluateContentKeywords: body doesn't contain CONFIDENTIAL -- passes +13. Request forwarded to api.anthropic.com +14. Response received, forwarded back to client +15. Session logged to audit.log as JSON +16. Session pushed to UI via "session:new" event +``` + +## Blocked by directory scope + +A tool tries to read a file outside the allowed project directory. + +``` +1-8. Same as above, but file extraction finds ["/etc/passwd"] +9. EvaluateScope: /etc/passwd is outside ~/Projects/my-app +10. Interceptor sends 403 to client: "file is outside allowed directories" +11. Session logged with blocked=true +12. UI shows the session in red +``` + +## Blocked by file pattern + +A request references a `.env` file that matches a deny pattern. + +``` +1-8. Same as above, but file extraction finds [".env"] +9. EvaluateScope: .env is inside the allowed directory -- passes +10. EvaluateFiles: ".env" matches pattern "*.env" +11. Interceptor sends 403 to client +12. Session logged with blocked=true, reason: matches "*.env" +``` + +## Blocked by content tag + +A developer added `// NO_LLM` to a file, and a tool tried to send it. + +``` +1-8. Same as above, file extraction finds ["internal/secrets.go"] +9. EvaluateScope: file is in scope -- passes +10. EvaluateFiles: no pattern match -- passes +11. EvaluateContentTags: body contains "NO_LLM" +12. Interceptor sends 403 to client +13. Session logged with blocked=true +``` + +The developer said no, and Egressor enforced it. No prompt, no override. + +## Interactive keyword prompt + +A request body contains the word "CONFIDENTIAL" and the files haven't been whitelisted or blacklisted. + +``` +1-8. Same as above, file extraction finds ["report.md"] +9-11. Scope, patterns, tags all pass +12. EvaluateContentKeywords: body contains "CONFIDENTIAL" + - report.md is not in whitelist or blacklist -> needs prompt +13. Interceptor calls resolver.PromptUser() and blocks +14. App emits "content:prompt" event to the React frontend +15. Frontend shows a modal: + "Request contains keyword CONFIDENTIAL" + Files: report.md + [Allow Once] [Allow Always] [Block Once] [Block Always] +16. User clicks "Block Always" +17. Frontend calls ResolveContentPrompt() to unblock the interceptor + Frontend calls ResolveContentPromptForFile() to add report.md to blacklist +18. App adds report.md to blacklist, sends decision to channel +19. Interceptor receives the block decision, sends 403 to client +20. Session logged with blocked=true +``` + +Next time a request references `report.md` and contains a keyword, it's auto-blocked without prompting. + +## Keyword prompt with whitelist hit + +Same as above, but the file was previously approved. + +``` +1-8. Same as above, file extraction finds ["report.md"] +9-11. Scope, patterns, tags all pass +12. EvaluateContentKeywords: body contains "CONFIDENTIAL" + - report.md is in whitelist -> auto-allowed + - No files need prompting +13. Request forwarded to upstream normally +``` + +No prompt, no delay. The user already said this file is fine. + +## Binary content skipped + +A request contains an image payload that shouldn't be scanned. + +``` +1-6. Same as above +7. Content-Type is image/png -- matches skip_content_types +8. File extraction and all policy checks are skipped +9. Request forwarded to upstream directly +10. Session logged (body not scanned) +``` \ No newline at end of file diff --git a/docs/design.md b/docs/design.md index b8643b1..f923420 100644 --- a/docs/design.md +++ b/docs/design.md @@ -1,8 +1,16 @@ -# Egressor — Application Design +# Egressor -- Design Overview -## Architecture Overview +Egressor is a local HTTPS proxy that sits between developer tools (Claude Code, Kiro, Cursor) and LLM APIs. It intercepts every outbound request, inspects the payload for file references and sensitive content, enforces configurable blocking rules, and logs everything. -Egressor is a local HTTPS intercepting proxy that monitors and controls outbound traffic from developer tools. Every HTTPS connection is TLS-terminated, inspected for file references, checked against deny policies, and logged. +The goal is simple: give developers visibility and control over what their AI tools send to external servers. + +For details on specific topics, see: + +- [Policy layers](policy.md) -- how blocking decisions are made +- [Data flow](data-flow.md) -- step-by-step request flows for each scenario +- [Components](components.md) -- how each part of the system works + +## Architecture ``` ┌─────────────────┐ @@ -10,226 +18,89 @@ Egressor is a local HTTPS intercepting proxy that monitors and controls outbound │ HTTPS_PROXY set │ └────────┬────────┘ │ CONNECT host:port HTTP/1.1 - ▼ -┌─────────────────────────────────────────────┐ -│ Egressor │ -│ │ -│ ┌──────────┐ │ -│ │ Proxy │ Accept CONNECT, dial upstream │ -│ │ Listener │ │ -│ └────┬─────┘ │ -│ │ │ -│ ▼ │ -│ ┌──────────────┐ │ -│ │ TLS │ MITM: dynamic certs │ -│ │ Interceptor │ HTTP/1.1 relay loop │ -│ └────┬─────────┘ │ -│ │ │ -│ ├──▶ Extract file references │ -│ ├──▶ Check allowed_directories │ -│ │ └─ OUT OF SCOPE → 403 │ -│ ├──▶ Check deny_file_patterns │ -│ │ └─ BLOCKED → 403 to client │ -│ ├──▶ Check deny_content_keywords │ -│ │ ├─ WHITELIST → auto-allow │ -│ │ ├─ BLACKLIST → auto-block 403 │ -│ │ └─ PROMPT USER → allow/block │ -│ │ │ -│ └──▶ ALLOWED → forward upstream │ -│ │ │ -│ ▼ │ -│ ┌──────────┐ ┌───────────────┐ │ -│ │ Audit │ │ Session │ │ -│ │ Logger │ │ Store (ring) │──▶ Wails UI │ -│ └──────────┘ └───────────────┘ │ -│ │ -│ ┌──────────┐ │ -│ │ Desktop │ Wails + React │ -│ │ UI │ Sessions / Policy / Controls │ -│ └──────────┘ │ -└─────────────────────────────────────────────┘ + v +┌───────────────────────────────────────────────┐ +│ Egressor │ +│ │ +│ ┌──────────┐ │ +│ │ Proxy │ Accept CONNECT, dial upstream │ +│ │ Listener │ │ +│ └────┬─────┘ │ +│ │ │ +│ v │ +│ ┌──────────────┐ │ +│ │ TLS │ MITM with dynamic certs │ +│ │ Interceptor │ HTTP/1.1 relay loop │ +│ └────┬─────────┘ │ +│ │ │ +│ ├──> Skip binary content types │ +│ ├──> Extract file references │ +│ ├──> Check allowed_directories │ +│ │ └─ OUT OF SCOPE -> 403 │ +│ ├──> Check deny_file_patterns │ +│ │ └─ PATTERN MATCH -> 403 │ +│ ├──> Check deny_content_tags │ +│ │ └─ TAG FOUND (e.g. NO_LLM) -> 403 │ +│ ├──> Check deny_content_keywords │ +│ │ ├─ WHITELIST -> auto-allow │ +│ │ ├─ BLACKLIST -> auto-block 403 │ +│ │ └─ PROMPT USER -> allow/block │ +│ │ │ +│ └──> ALLOWED -> forward upstream │ +│ │ │ +│ v │ +│ ┌──────────┐ ┌───────────────┐ │ +│ │ Audit │ │ Session │ │ +│ │ Logger │ │ Store (ring) │──> Wails UI │ +│ └──────────┘ └───────────────┘ │ +│ │ +│ ┌──────────┐ ┌──────────┐ │ +│ │ Desktop │ │ System │ │ +│ │ UI │ │ Tray │ (macOS menu bar) │ +│ └──────────┘ └──────────┘ │ +└───────────────────────────────────────────────┘ │ - ▼ + v ┌─────────────────┐ │ Remote Endpoint │ (api.anthropic.com, etc.) └─────────────────┘ ``` -## Components - -### Proxy Server (`internal/proxy/proxy.go`) - -- Binds to `127.0.0.1:8080` (configurable) -- Accepts HTTP CONNECT requests -- Dials upstream TCP connection (5s timeout) -- Passes both connections to the TLS Interceptor -- Supports `Start()` / `Stop()` / `IsRunning()` for UI-driven lifecycle - -### TLS Interceptor (`internal/proxy/intercept.go`) - -All connections are intercepted — there is no pass-through tunnel mode. - -For each connection: -1. TLS-terminate the client side with a dynamic certificate (from cert cache) -2. TLS-connect to the real upstream server -3. HTTP/1.1 relay loop: - - Read full request body into buffer - - Extract file references from the body - - Evaluate file paths against `allowed_directories` — block if out of scope - - Evaluate file paths against `deny_file_patterns` — block if matched - - Scan body for `deny_content_keywords` — check whitelist/blacklist, prompt user if needed - - If blocked: send 403 back to client, log, stop - - If allowed: forward request to upstream, relay response back -4. Record exchange in session - -Key design: the body is fully buffered before forwarding to enable file detection and policy enforcement before the request reaches the LLM. - -### File Extraction (`internal/extract/files.go`) - -Scans intercepted request bodies for file references. Handles: - -- **JSON fields**: walks parsed JSON looking for keys like `path`, `file_path`, `filename`, `source`, `uri` and string values that look like file paths -- **Text patterns**: regex matching for markdown code fences (`` ```lang:path ``), XML tags (``), and text references (`File: path`, `from path/to/file`) -- Deduplicates results, filters out URLs, validates file extensions - -Returns `[]FileRef{Path, Source}` where Source is `"json_field"` or `"text_pattern"`. - -### Policy Engine (`internal/policy/policy.go`) - -Two-layer policy enforcement: - -**Directory scope** — `EvaluateScope(paths []string) Decision`: -- Checks if file paths fall within `allowed_directories` -- Resolves relative paths against cwd, cleans `../` traversals -- If no directories configured, all paths are allowed (default) -- Runtime mutation: `GetAllowedDirectories()`, `SetAllowedDirectories()` - -**File pattern deny** — `EvaluateFiles(paths []string) Decision`: -- Checks paths against `deny_file_patterns` -- Pattern matching: `filepath.Match` for globs, `**/` prefix for recursive matching, basename fallback -- Runtime mutation: `GetDenyPatterns()`, `SetDenyPatterns()`, `AddDenyPattern()`, `RemoveDenyPattern()` - -**Content keyword approval** — `EvaluateContentKeywords(body string, filePaths []string) ContentKeywordResult`: -- Case-insensitive substring scan of body against `deny_content_keywords` -- Partitions files into whitelist-allowed, blacklist-blocked, and needs-prompt -- Interactive: pauses request, emits `content:prompt` event, waits for user decision (30s timeout) -- User choices: Allow Once, Allow Always (whitelist), Block Once, Block Always (blacklist) -- `PromptResolver` interface: `App` implements for UI mode, `HeadlessResolver` blocks by default -- Whitelist/blacklist persisted to config via SaveConfig - -All layers: -- Pause/bypass via atomic bool (for UI toggle) -- Thread-safe with `sync.RWMutex` - -### Certificate Authority (`internal/ca/`) - -**`ca.go`** — Self-signed ECDSA P-256 root CA: -- `LoadOrGenerate()`: loads from `~/.egressor/ca.pem` or auto-generates -- 10-year validity, `KeyUsageCertSign` -- Key stored with `0600` permissions +## How it works, briefly -**`cert.go`** — Dynamic leaf certificate cache: -- `GetCertificate(hello)`: implements `tls.Config.GetCertificate` -- LRU cache (1024 entries), 24-hour cert validity -- Generates per-hostname leaf certs signed by the CA -- Supports both DNS names and IP SANs - -### Audit Logger (`internal/audit/logger.go`) - -- Newline-delimited JSON to file (`~/.egressor/logs/audit.log`) -- Size-based rotation: when file exceeds `max_size_mb`, renames to `audit.log.` -- Rotated files accumulate indefinitely -- Mutex-protected for concurrent writes - -### Session Store (`internal/audit/store.go`) - -- In-memory ring buffer (1000 sessions) for the desktop UI -- `OnSession(fn)` observer callback — pushes new sessions to Wails frontend via events -- `Recent(limit)`, `GetByID(id)`, `Stats()` for UI queries -- Thread-safe with `sync.RWMutex` - -### Session Sink (`internal/audit/observer.go`) - -- `SessionSink` interface: `Log(*Session)` -- `MultiSink` fans out to both Logger (file) and SessionStore (UI) -- The proxy server accepts any `SessionSink`, keeping it decoupled from specific consumers - -### Desktop UI (`internal/ui/`) - -**Go layer:** -- `app.go` — Wails-bound struct, all public methods callable from frontend -- `ui.go` — Wails window configuration and runner -- Frontend assets embedded via `//go:embed all:frontend/dist` - -**React frontend** (`internal/ui/frontend/`): -- Sessions tab: live table with real-time updates via `EventsOn("session:new")` -- Detail panel: request/response inspector with JSON viewer, detected files, blocked indicator -- Policy tab: allowed directories and deny patterns with save-to-config -- Bottom bar: proxy controls, policy pause/resume, stats - -### Configuration (`internal/config/config.go`) - -- YAML format with sensible defaults -- `~` expansion for paths -- `Save()` for persisting UI changes back to file -- Config resolution: `--config` flag → `./config.yaml` → `~/.egressor/config.yaml` - -## Data Flow - -### Allowed request - -``` -1. Client → CONNECT api.anthropic.com:443 -2. Proxy: dial TCP to api.anthropic.com:443 -3. Proxy → Client: 200 Connection Established -4. Interceptor: TLS handshake with client (dynamic cert) -5. Interceptor: TLS handshake with upstream (real cert) -6. Interceptor: read HTTP request, buffer body -7. Extract: scan body → detected_files: ["src/main.go"] -8. Policy: EvaluateScope(["src/main.go"]) → in scope -9. Policy: EvaluateFiles(["src/main.go"]) → allowed -10. Interceptor: forward request to upstream -11. Interceptor: read response, forward to client -12. Logger: write session JSON to audit.log -13. Store: add session, emit "session:new" event → UI -``` - -### Blocked request - -``` -1-6. Same as above -7. Extract: scan body → detected_files: [".env"] -8. Policy: EvaluateScope([".env"]) → in scope (or blocked if outside allowed dirs) -9. Policy: EvaluateFiles([".env"]) → denied (matches "*.env") -10. Interceptor: send 403 back to client over TLS -11. Logger: write session with blocked=true, block_reason -12. Store: add session → UI shows red row -``` +1. Developer configures their tool to use `HTTPS_PROXY=http://127.0.0.1:8080` and trusts the Egressor CA +2. The tool sends a `CONNECT` request for the LLM API host +3. Egressor dials the real server, then performs a TLS man-in-the-middle using a dynamically generated certificate +4. With both sides decrypted, Egressor reads the HTTP request, buffers the body, and runs it through four policy layers +5. If any layer blocks, the request never leaves the machine -- the client gets a 403 +6. If everything passes, the request is forwarded normally and the response is relayed back +7. Every session is logged to disk and (in UI mode) pushed to the frontend in real-time -## Security Considerations +## Security considerations -- **CA key**: `0600` permissions, stored in `~/.egressor/` -- **Network scope**: binds to `127.0.0.1` only — not remotely accessible -- **Intercepted content**: full HTTP bodies logged — treat audit logs as sensitive -- **CA trust**: must be explicitly added to OS keychain by the user -- **Node.js tools**: require `NODE_EXTRA_CA_CERTS` pointing to the CA cert +- **CA key**: stored with `0600` permissions in `~/.egressor/` +- **Network scope**: binds to `127.0.0.1` only -- not accessible from other machines +- **Intercepted content**: full HTTP bodies can be logged -- treat audit logs as sensitive +- **CA trust**: must be explicitly added to the OS keychain by the user +- **No remote access**: there's no API server or remote management interface +- **Binary content**: Egressor has no OCR or binary decoding -- non-text content types are skipped entirely -## Project Structure +## Project structure ``` cmd/egressor/main.go Entry point, config resolution, mode selection internal/ proxy/ proxy.go TCP listener, CONNECT handler, lifecycle - intercept.go TLS MITM, HTTP relay, file extraction, blocking + intercept.go TLS MITM, HTTP relay, policy enforcement policy/ - policy.go Directory scope + deny pattern engine + policy.go Scope, patterns, tags, keywords engine + prompt.go PromptResolver interface, HeadlessResolver audit/ session.go Session, InterceptedExchange, FileRef models logger.go JSON file logger with rotation store.go In-memory ring buffer for UI observer.go SessionSink interface, MultiSink - auditfakes/ Counterfeiter-generated test fakes ca/ ca.go CA generation and loading cert.go Leaf certificate LRU cache @@ -237,26 +108,17 @@ internal/ files.go File reference extraction from payloads config/ config.go YAML config loader with defaults + tray/ + tray.go macOS system tray (menu bar icon) + tray_stub.go No-op for non-macOS platforms ui/ - app.go Wails-bound application struct + app.go Wails-bound app (sessions, policy, prompts) ui.go Wails window runner + embedded assets frontend/ React + TypeScript + Tailwind CSS - src/ - App.tsx Two-tab layout (Sessions / Policy) - components/ - SessionTable.tsx Live session list - SessionDetail.tsx Exchange inspector - RequestPane.tsx Request headers + body + files - ResponsePane.tsx Response headers + body - PolicyEditor.tsx Allowed dirs + deny pattern CRUD - ProxyControls.tsx Start/stop/pause + stats - JsonViewer.tsx Formatted JSON display - hooks/ - useSessions.ts Real-time session state + Wails events - usePolicy.ts Policy management -config.yaml Default configuration -.goreleaser-macos.yaml macOS build (Wails UI, amd64 + arm64) -.goreleaser-linux.yaml Linux build (headless, amd64 + arm64) -.goreleaser-windows.yaml Windows build (headless, amd64 + arm64) -.github/workflows/release.yml CI: tag → test → build → GitHub Release +config.yaml Default configuration (fully commented) +docs/ + design.md This file -- architecture overview + policy.md Policy layers and blocking rules + data-flow.md Step-by-step request flow examples + components.md Component details ``` \ No newline at end of file diff --git a/docs/policy.md b/docs/policy.md new file mode 100644 index 0000000..69e6df0 --- /dev/null +++ b/docs/policy.md @@ -0,0 +1,143 @@ +# Policy Layers + +Egressor checks every outbound request through four policy layers, in order. Each layer has a different purpose and a different level of user interaction. + +If any layer blocks the request, it stops there -- later layers aren't checked. + +## Layer 1: Directory scope (`allowed_directories`) + +**Type:** Hard block, no prompt + +This is the broadest control. It restricts which parts of the filesystem LLM tools can access. If you set this to your project directory, any file reference outside that scope is blocked immediately. + +This catches tools that try to read files they shouldn't -- like `~/.ssh/id_rsa`, `/etc/passwd`, `~/.aws/credentials`, or files from other projects on your machine. + +```yaml +allowed_directories: + - "~/Projects/my-app" +``` + +How it works: +- File paths detected in the request body are resolved to absolute paths +- Relative paths (like `../`) are cleaned before comparison +- Each path must fall within at least one allowed directory +- If no directories are configured, this layer is skipped (everything passes) + +## Layer 2: File patterns (`deny_file_patterns`) + +**Type:** Hard block, no prompt + +Catches sensitive files by name, even if they're inside your allowed directories. Your project probably has an `.env` file -- you still don't want it sent to an LLM. + +```yaml +deny_file_patterns: + - "*.env" # environment files (.env, .env.local) + - "*.pem" # certificates + - "*.key" # private keys + - "**/secrets/**" # anything under a secrets/ directory + - "**/credentials*" # credential files + - ".aws/*" # AWS config +``` + +How it works: +- Uses Go's `filepath.Match` for glob matching +- The `**/` prefix matches at any directory depth +- Also tries matching against just the filename (basename) +- Case-insensitive + +## Layer 3: Content tags (`deny_content_tags`) + +**Type:** Hard block, no prompt + +This gives developers a way to opt individual files out of LLM processing. Add a tag as a comment at the top of any file: + +```go +// NO_LLM +package internal +``` + +```python +# NO_LLM +class TradeSecret: + ... +``` + +```yaml +# NO_LLM +api_keys: + production: sk-... +``` + +When Egressor sees this tag in the request body, it blocks immediately. No prompt, no whitelist bypass. The developer explicitly marked the file. + +```yaml +deny_content_tags: + - "NO_LLM" +``` + +How it works: +- Case-insensitive substring search on the full request body +- The tag can be anywhere in the body, not just at the top (since the file content is embedded in a JSON payload, Egressor can't know where "the top" is) +- You can define multiple tags if your team uses different conventions + +## Layer 4: Content keywords (`deny_content_keywords`) + +**Type:** Interactive -- user is prompted + +For content that might be sensitive but needs human judgment. Unlike the layers above, this one pauses the request and asks the user what to do. + +```yaml +deny_content_keywords: + - "CONFIDENTIAL" + - "INTERNAL ONLY" +``` + +When a keyword is detected, Egressor shows a modal in the desktop UI with four options: + +- **Allow Once** -- forward this request, don't remember the decision +- **Allow Always** -- forward and add the file to `content_whitelist` so it won't be asked again +- **Block Once** -- return 403, don't remember +- **Block Always** -- return 403 and add the file to `content_blacklist` so it's auto-blocked next time + +The whitelist and blacklist are checked before prompting. If a file has already been approved or blocked permanently, the user isn't bothered again. + +```yaml +# These are populated automatically by the UI. +# You can also edit them by hand. +content_whitelist: [] +content_blacklist: [] +``` + +In headless mode (no desktop UI), keyword matches are blocked by default since there's no way to prompt. + +The prompt has a 30-second timeout. If the user doesn't respond, the request is blocked. + +How it works: +- Case-insensitive substring search on the request body +- Files are partitioned into whitelist (auto-allow), blacklist (auto-block), and needs-prompt +- The interceptor goroutine blocks on a channel while waiting for the user's response +- The UI sends the decision back via a Wails-bound method + +## Content type filtering + +Before any of the content-based checks (layers 3 and 4) run, Egressor checks the request's `Content-Type` header. Binary and encoded content is skipped entirely -- Egressor has no OCR or binary decoding, so scanning these would only produce false positives. + +```yaml +intercept: + skip_content_types: + - "image/*" + - "audio/*" + - "video/*" + - "application/octet-stream" + - "application/zip" + - "application/gzip" + - "application/pdf" +``` + +Supports wildcards -- `image/*` matches `image/png`, `image/jpeg`, etc. + +## Bypass + +All four layers respect the policy bypass toggle. When bypassed (via the UI's "Pause Policy" button or the system tray), all checks are skipped and traffic flows through unmodified. This is useful for debugging or when you temporarily need unrestricted access. + +Bypass state is not persisted -- it resets when Egressor restarts. \ No newline at end of file diff --git a/internal/config/config.go b/internal/config/config.go index 8350638..2c20764 100644 --- a/internal/config/config.go +++ b/internal/config/config.go @@ -17,18 +17,20 @@ type Config struct { } type InterceptConfig struct { - CACert string `yaml:"ca_cert"` - CAKey string `yaml:"ca_key"` - LogBody bool `yaml:"log_body"` - MaxBodySize int `yaml:"max_body_size"` + CACert string `yaml:"ca_cert"` + CAKey string `yaml:"ca_key"` + LogBody bool `yaml:"log_body"` + MaxBodySize int `yaml:"max_body_size"` + SkipContentTypes []string `yaml:"skip_content_types"` } type PolicyConfig struct { - DenyFilePatterns []string `yaml:"deny_file_patterns"` - AllowedDirectories []string `yaml:"allowed_directories"` - DenyContentKeywords []string `yaml:"deny_content_keywords"` - ContentKeywordWhitelist []string `yaml:"content_keyword_whitelist"` - ContentKeywordBlacklist []string `yaml:"content_keyword_blacklist"` + DenyFilePatterns []string `yaml:"deny_file_patterns"` + AllowedDirectories []string `yaml:"allowed_directories"` + DenyContentTags []string `yaml:"deny_content_tags"` + DenyContentKeywords []string `yaml:"deny_content_keywords"` + ContentWhitelist []string `yaml:"content_whitelist"` + ContentBlacklist []string `yaml:"content_blacklist"` } type LogConfig struct { @@ -64,6 +66,17 @@ func Load(path string) (*Config, error) { if cfg.Intercept.MaxBodySize == 0 { cfg.Intercept.MaxBodySize = 65536 } + if len(cfg.Intercept.SkipContentTypes) == 0 { + cfg.Intercept.SkipContentTypes = []string{ + "image/*", + "audio/*", + "video/*", + "application/octet-stream", + "application/zip", + "application/gzip", + "application/pdf", + } + } cfg.Intercept.CACert = expandHome(cfg.Intercept.CACert) cfg.Intercept.CAKey = expandHome(cfg.Intercept.CAKey) diff --git a/internal/policy/policy.go b/internal/policy/policy.go index c1bc56b..7ceaea9 100644 --- a/internal/policy/policy.go +++ b/internal/policy/policy.go @@ -156,7 +156,74 @@ func resolvePath(p string) string { return p } -// --- Content keyword methods --- +// --- Content tag methods (hard block) --- + +// GetDenyContentTags returns the current deny content tags. +func (e *Engine) GetDenyContentTags() []string { + e.mu.RLock() + defer e.mu.RUnlock() + out := make([]string, len(e.cfg.DenyContentTags)) + copy(out, e.cfg.DenyContentTags) + return out +} + +// SetDenyContentTags replaces all deny content tags. +func (e *Engine) SetDenyContentTags(tags []string) { + e.mu.Lock() + defer e.mu.Unlock() + e.cfg.DenyContentTags = make([]string, len(tags)) + copy(e.cfg.DenyContentTags, tags) +} + +// AddDenyContentTag appends a single deny content tag. +func (e *Engine) AddDenyContentTag(tag string) { + e.mu.Lock() + defer e.mu.Unlock() + e.cfg.DenyContentTags = append(e.cfg.DenyContentTags, tag) +} + +// RemoveDenyContentTag removes a single deny content tag. +func (e *Engine) RemoveDenyContentTag(tag string) { + e.mu.Lock() + defer e.mu.Unlock() + filtered := e.cfg.DenyContentTags[:0] + for _, t := range e.cfg.DenyContentTags { + if t != tag { + filtered = append(filtered, t) + } + } + e.cfg.DenyContentTags = filtered +} + +// EvaluateContentTags scans the body for deny_content_tags. +// This is a hard block — no user prompt, no whitelist/blacklist. +func (e *Engine) EvaluateContentTags(body string) Decision { + if e.bypassed.Load() { + return Decision{Allowed: true, Reason: "policy bypassed (paused)"} + } + + e.mu.RLock() + tags := make([]string, len(e.cfg.DenyContentTags)) + copy(tags, e.cfg.DenyContentTags) + e.mu.RUnlock() + + if len(tags) == 0 { + return Decision{Allowed: true, Reason: "no content tags configured"} + } + + bodyLower := strings.ToLower(body) + for _, tag := range tags { + if strings.Contains(bodyLower, strings.ToLower(tag)) { + return Decision{ + Allowed: false, + Reason: fmt.Sprintf("body contains denied tag %q", tag), + } + } + } + return Decision{Allowed: true, Reason: "no content tags matched"} +} + +// --- Content keyword methods (interactive) --- // GetDenyContentKeywords returns the current deny content keywords. func (e *Engine) GetDenyContentKeywords() []string { @@ -195,72 +262,72 @@ func (e *Engine) RemoveDenyContentKeyword(keyword string) { e.cfg.DenyContentKeywords = filtered } -// GetContentKeywordWhitelist returns file paths that bypass content keyword checks. -func (e *Engine) GetContentKeywordWhitelist() []string { +// GetContentWhitelist returns file paths that bypass content keyword checks. +func (e *Engine) GetContentWhitelist() []string { e.mu.RLock() defer e.mu.RUnlock() - out := make([]string, len(e.cfg.ContentKeywordWhitelist)) - copy(out, e.cfg.ContentKeywordWhitelist) + out := make([]string, len(e.cfg.ContentWhitelist)) + copy(out, e.cfg.ContentWhitelist) return out } -// AddToContentKeywordWhitelist adds a file path to the content keyword whitelist. -func (e *Engine) AddToContentKeywordWhitelist(path string) { +// AddToContentWhitelist adds a file path to the content keyword whitelist. +func (e *Engine) AddToContentWhitelist(path string) { e.mu.Lock() defer e.mu.Unlock() - for _, p := range e.cfg.ContentKeywordWhitelist { + for _, p := range e.cfg.ContentWhitelist { if p == path { return } } - e.cfg.ContentKeywordWhitelist = append(e.cfg.ContentKeywordWhitelist, path) + e.cfg.ContentWhitelist = append(e.cfg.ContentWhitelist, path) } -// RemoveFromContentKeywordWhitelist removes a file path from the whitelist. -func (e *Engine) RemoveFromContentKeywordWhitelist(path string) { +// RemoveFromContentWhitelist removes a file path from the whitelist. +func (e *Engine) RemoveFromContentWhitelist(path string) { e.mu.Lock() defer e.mu.Unlock() - filtered := e.cfg.ContentKeywordWhitelist[:0] - for _, p := range e.cfg.ContentKeywordWhitelist { + filtered := e.cfg.ContentWhitelist[:0] + for _, p := range e.cfg.ContentWhitelist { if p != path { filtered = append(filtered, p) } } - e.cfg.ContentKeywordWhitelist = filtered + e.cfg.ContentWhitelist = filtered } -// GetContentKeywordBlacklist returns file paths that are always blocked by content keyword checks. -func (e *Engine) GetContentKeywordBlacklist() []string { +// GetContentBlacklist returns file paths that are always blocked by content keyword checks. +func (e *Engine) GetContentBlacklist() []string { e.mu.RLock() defer e.mu.RUnlock() - out := make([]string, len(e.cfg.ContentKeywordBlacklist)) - copy(out, e.cfg.ContentKeywordBlacklist) + out := make([]string, len(e.cfg.ContentBlacklist)) + copy(out, e.cfg.ContentBlacklist) return out } -// AddToContentKeywordBlacklist adds a file path to the content keyword blacklist. -func (e *Engine) AddToContentKeywordBlacklist(path string) { +// AddToContentBlacklist adds a file path to the content keyword blacklist. +func (e *Engine) AddToContentBlacklist(path string) { e.mu.Lock() defer e.mu.Unlock() - for _, p := range e.cfg.ContentKeywordBlacklist { + for _, p := range e.cfg.ContentBlacklist { if p == path { return } } - e.cfg.ContentKeywordBlacklist = append(e.cfg.ContentKeywordBlacklist, path) + e.cfg.ContentBlacklist = append(e.cfg.ContentBlacklist, path) } -// RemoveFromContentKeywordBlacklist removes a file path from the blacklist. -func (e *Engine) RemoveFromContentKeywordBlacklist(path string) { +// RemoveFromContentBlacklist removes a file path from the blacklist. +func (e *Engine) RemoveFromContentBlacklist(path string) { e.mu.Lock() defer e.mu.Unlock() - filtered := e.cfg.ContentKeywordBlacklist[:0] - for _, p := range e.cfg.ContentKeywordBlacklist { + filtered := e.cfg.ContentBlacklist[:0] + for _, p := range e.cfg.ContentBlacklist { if p != path { filtered = append(filtered, p) } } - e.cfg.ContentKeywordBlacklist = filtered + e.cfg.ContentBlacklist = filtered } // ContentKeywordResult holds the outcome of a content keyword evaluation. @@ -282,12 +349,12 @@ func (e *Engine) EvaluateContentKeywords(body string, filePaths []string) Conten e.mu.RLock() keywords := make([]string, len(e.cfg.DenyContentKeywords)) copy(keywords, e.cfg.DenyContentKeywords) - whitelist := make(map[string]bool, len(e.cfg.ContentKeywordWhitelist)) - for _, p := range e.cfg.ContentKeywordWhitelist { + whitelist := make(map[string]bool, len(e.cfg.ContentWhitelist)) + for _, p := range e.cfg.ContentWhitelist { whitelist[p] = true } - blacklist := make(map[string]bool, len(e.cfg.ContentKeywordBlacklist)) - for _, p := range e.cfg.ContentKeywordBlacklist { + blacklist := make(map[string]bool, len(e.cfg.ContentBlacklist)) + for _, p := range e.cfg.ContentBlacklist { blacklist[p] = true } e.mu.RUnlock() diff --git a/internal/policy/policy_test.go b/internal/policy/policy_test.go index 91faeb7..3491158 100644 --- a/internal/policy/policy_test.go +++ b/internal/policy/policy_test.go @@ -157,6 +157,60 @@ func TestEvaluateScope_ParentTraversal(t *testing.T) { } } +func TestEvaluateContentTags_Match(t *testing.T) { + engine := NewEngine(config.PolicyConfig{ + DenyContentTags: []string{"NO_LLM"}, + }) + + decision := engine.EvaluateContentTags("// NO_LLM\npackage secrets\nvar key = \"abc\"") + if decision.Allowed { + t.Error("expected blocked when body contains NO_LLM tag") + } +} + +func TestEvaluateContentTags_CaseInsensitive(t *testing.T) { + engine := NewEngine(config.PolicyConfig{ + DenyContentTags: []string{"NO_LLM"}, + }) + + decision := engine.EvaluateContentTags("# no_llm\nimport os") + if decision.Allowed { + t.Error("expected case-insensitive match for NO_LLM") + } +} + +func TestEvaluateContentTags_NoMatch(t *testing.T) { + engine := NewEngine(config.PolicyConfig{ + DenyContentTags: []string{"NO_LLM"}, + }) + + decision := engine.EvaluateContentTags("package main\nfunc main() {}") + if !decision.Allowed { + t.Error("expected allowed when body does not contain tag") + } +} + +func TestEvaluateContentTags_NoTags(t *testing.T) { + engine := NewEngine(config.PolicyConfig{}) + + decision := engine.EvaluateContentTags("// NO_LLM\npackage main") + if !decision.Allowed { + t.Error("expected allowed when no tags configured") + } +} + +func TestEvaluateContentTags_Bypassed(t *testing.T) { + engine := NewEngine(config.PolicyConfig{ + DenyContentTags: []string{"NO_LLM"}, + }) + engine.SetBypassed(true) + + decision := engine.EvaluateContentTags("// NO_LLM") + if !decision.Allowed { + t.Error("expected allowed when policy bypassed") + } +} + func TestEvaluateContentKeywords_Match(t *testing.T) { engine := NewEngine(config.PolicyConfig{ DenyContentKeywords: []string{"CONFIDENTIAL", "INTERNAL ONLY"}, @@ -219,8 +273,8 @@ func TestEvaluateContentKeywords_Bypassed(t *testing.T) { func TestEvaluateContentKeywords_WhitelistBypass(t *testing.T) { engine := NewEngine(config.PolicyConfig{ - DenyContentKeywords: []string{"CONFIDENTIAL"}, - ContentKeywordWhitelist: []string{"trusted.go"}, + DenyContentKeywords: []string{"CONFIDENTIAL"}, + ContentWhitelist: []string{"trusted.go"}, }) result := engine.EvaluateContentKeywords("CONFIDENTIAL data", []string{"trusted.go", "untrusted.go"}) @@ -237,8 +291,8 @@ func TestEvaluateContentKeywords_WhitelistBypass(t *testing.T) { func TestEvaluateContentKeywords_BlacklistBlock(t *testing.T) { engine := NewEngine(config.PolicyConfig{ - DenyContentKeywords: []string{"CONFIDENTIAL"}, - ContentKeywordBlacklist: []string{"blocked.go"}, + DenyContentKeywords: []string{"CONFIDENTIAL"}, + ContentBlacklist: []string{"blocked.go"}, }) result := engine.EvaluateContentKeywords("CONFIDENTIAL data", []string{"blocked.go", "other.go"}) @@ -255,8 +309,8 @@ func TestEvaluateContentKeywords_BlacklistBlock(t *testing.T) { func TestEvaluateContentKeywords_AllWhitelisted(t *testing.T) { engine := NewEngine(config.PolicyConfig{ - DenyContentKeywords: []string{"CONFIDENTIAL"}, - ContentKeywordWhitelist: []string{"a.go", "b.go"}, + DenyContentKeywords: []string{"CONFIDENTIAL"}, + ContentWhitelist: []string{"a.go", "b.go"}, }) result := engine.EvaluateContentKeywords("CONFIDENTIAL data", []string{"a.go", "b.go"}) @@ -271,35 +325,35 @@ func TestEvaluateContentKeywords_AllWhitelisted(t *testing.T) { } } -func TestContentKeywordWhitelistCRUD(t *testing.T) { +func TestContentWhitelistCRUD(t *testing.T) { engine := NewEngine(config.PolicyConfig{}) - engine.AddToContentKeywordWhitelist("file.go") - engine.AddToContentKeywordWhitelist("file.go") // duplicate - wl := engine.GetContentKeywordWhitelist() + engine.AddToContentWhitelist("file.go") + engine.AddToContentWhitelist("file.go") // duplicate + wl := engine.GetContentWhitelist() if len(wl) != 1 { t.Fatalf("expected 1 entry, got %d", len(wl)) } - engine.RemoveFromContentKeywordWhitelist("file.go") - wl = engine.GetContentKeywordWhitelist() + engine.RemoveFromContentWhitelist("file.go") + wl = engine.GetContentWhitelist() if len(wl) != 0 { t.Fatalf("expected 0 entries, got %d", len(wl)) } } -func TestContentKeywordBlacklistCRUD(t *testing.T) { +func TestContentBlacklistCRUD(t *testing.T) { engine := NewEngine(config.PolicyConfig{}) - engine.AddToContentKeywordBlacklist("file.go") - engine.AddToContentKeywordBlacklist("file.go") // duplicate - bl := engine.GetContentKeywordBlacklist() + engine.AddToContentBlacklist("file.go") + engine.AddToContentBlacklist("file.go") // duplicate + bl := engine.GetContentBlacklist() if len(bl) != 1 { t.Fatalf("expected 1 entry, got %d", len(bl)) } - engine.RemoveFromContentKeywordBlacklist("file.go") - bl = engine.GetContentKeywordBlacklist() + engine.RemoveFromContentBlacklist("file.go") + bl = engine.GetContentBlacklist() if len(bl) != 0 { t.Fatalf("expected 0 entries, got %d", len(bl)) } diff --git a/internal/proxy/intercept.go b/internal/proxy/intercept.go index 60d5a72..0c5893d 100644 --- a/internal/proxy/intercept.go +++ b/internal/proxy/intercept.go @@ -20,20 +20,22 @@ import ( // Interceptor performs TLS interception (MITM) to capture HTTP traffic. type Interceptor struct { - certCache *ca.CertCache - logBody bool - maxBody int - policy *policy.Engine - resolver policy.PromptResolver + certCache *ca.CertCache + logBody bool + maxBody int + policy *policy.Engine + resolver policy.PromptResolver + skipContentTypes []string } // NewInterceptor creates an interceptor backed by the given CA authority. -func NewInterceptor(authority *ca.Authority, logBody bool, maxBody int, pol *policy.Engine) *Interceptor { +func NewInterceptor(authority *ca.Authority, logBody bool, maxBody int, pol *policy.Engine, skipContentTypes []string) *Interceptor { return &Interceptor{ - certCache: ca.NewCertCache(authority), - logBody: logBody, - maxBody: maxBody, - policy: pol, + certCache: ca.NewCertCache(authority), + logBody: logBody, + maxBody: maxBody, + policy: pol, + skipContentTypes: skipContentTypes, } } @@ -96,9 +98,12 @@ func (i *Interceptor) Intercept(clientConn net.Conn, upstreamConn net.Conn, host } bodyStr := reqBodyBuf.String() + // Skip content scanning for binary/non-text content types + skipScan := i.shouldSkipContentScan(req.Header.Get("Content-Type")) + // Extract file references from the request payload files := extract.FilesFromBody(bodyStr) - if len(files) > 0 { + if len(files) > 0 && !skipScan { for _, f := range files { exchange.DetectedFiles = append(exchange.DetectedFiles, audit.FileRef{ Path: f.Path, @@ -169,6 +174,32 @@ func (i *Interceptor) Intercept(clientConn net.Conn, upstreamConn net.Conn, host return nil } + // Check content tags — hard block (e.g. NO_LLM) + decision = i.policy.EvaluateContentTags(bodyStr) + if !decision.Allowed { + slog.Warn("request blocked by content tag", + "session", sess.ID, + "url", exchange.URL, + "reason", decision.Reason, + ) + exchange.StatusCode = 403 + exchange.Blocked = true + exchange.BlockReason = decision.Reason + if i.logBody { + exchange.RequestBody = truncateBody(bodyStr, i.maxBody) + } + resp403 := &http.Response{ + StatusCode: 403, + ProtoMajor: 1, + ProtoMinor: 1, + Header: http.Header{"Content-Type": {"text/plain"}}, + Body: io.NopCloser(strings.NewReader("blocked by egressor: " + decision.Reason)), + } + resp403.Write(clientTLS) + sess.Exchanges = append(sess.Exchanges, exchange) + return nil + } + // Check content keywords — interactive approval kwResult := i.policy.EvaluateContentKeywords(bodyStr, paths) if kwResult.HasMatch { @@ -355,6 +386,32 @@ func stripHopByHop(h http.Header) { } } +// shouldSkipContentScan checks if the request content type matches any skip pattern. +// Supports wildcards like "image/*". +func (i *Interceptor) shouldSkipContentScan(contentType string) bool { + if contentType == "" { + return false + } + ct := strings.ToLower(strings.TrimSpace(contentType)) + // Strip parameters (e.g. "text/plain; charset=utf-8" → "text/plain") + if idx := strings.IndexByte(ct, ';'); idx >= 0 { + ct = strings.TrimSpace(ct[:idx]) + } + for _, skip := range i.skipContentTypes { + skip = strings.ToLower(skip) + if strings.HasSuffix(skip, "/*") { + // Wildcard match: "image/*" matches "image/png" + prefix := skip[:len(skip)-1] // "image/" + if strings.HasPrefix(ct, prefix) { + return true + } + } else if ct == skip { + return true + } + } + return false +} + func truncateBody(body string, max int) string { if len(body) > max { return body[:max] + "[truncated]" diff --git a/internal/proxy/intercept_test.go b/internal/proxy/intercept_test.go index 91934f9..79ce9bd 100644 --- a/internal/proxy/intercept_test.go +++ b/internal/proxy/intercept_test.go @@ -97,6 +97,58 @@ func TestLimitWriter(t *testing.T) { } } +func TestShouldSkipContentScan(t *testing.T) { + i := &Interceptor{ + skipContentTypes: []string{ + "image/*", + "audio/*", + "video/*", + "application/octet-stream", + "application/zip", + "application/pdf", + }, + } + + tests := []struct { + contentType string + skip bool + }{ + // Should skip + {"image/png", true}, + {"image/jpeg", true}, + {"Image/PNG", true}, // case insensitive + {"audio/mpeg", true}, + {"video/mp4", true}, + {"application/octet-stream", true}, + {"application/zip", true}, + {"application/pdf", true}, + {"image/png; charset=utf-8", true}, // with parameters + + // Should NOT skip + {"application/json", false}, + {"text/plain", false}, + {"text/html; charset=utf-8", false}, + {"", false}, // empty + {"application/javascript", false}, + } + + for _, tt := range tests { + t.Run(tt.contentType, func(t *testing.T) { + got := i.shouldSkipContentScan(tt.contentType) + if got != tt.skip { + t.Errorf("shouldSkipContentScan(%q) = %v, want %v", tt.contentType, got, tt.skip) + } + }) + } +} + +func TestShouldSkipContentScan_Empty(t *testing.T) { + i := &Interceptor{skipContentTypes: nil} + if i.shouldSkipContentScan("image/png") { + t.Error("expected no skip when skipContentTypes is nil") + } +} + type byteWriter struct { buf *[]byte } diff --git a/internal/ui/app.go b/internal/ui/app.go index 93fbd97..7004484 100644 --- a/internal/ui/app.go +++ b/internal/ui/app.go @@ -10,6 +10,7 @@ import ( "github.com/ehsaniara/egressor/internal/config" "github.com/ehsaniara/egressor/internal/policy" "github.com/ehsaniara/egressor/internal/proxy" + "github.com/ehsaniara/egressor/internal/tray" wailsRuntime "github.com/wailsapp/wails/v2/pkg/runtime" ) @@ -51,6 +52,24 @@ func (a *App) Startup(ctx context.Context) { if err := a.server.Start(); err != nil { slog.Error("failed to start proxy", "err", err) } + + // Start system tray icon (macOS menu bar) + if tray.Available() { + go tray.Run(tray.Callbacks{ + OnPauseToggle: func(paused bool) { + a.engine.SetBypassed(paused) + if paused { + slog.Info("policy paused via tray") + } else { + slog.Info("policy resumed via tray") + } + }, + OnQuit: func() { + a.server.Stop() + wailsRuntime.Quit(a.ctx) + }, + }) + } } // Shutdown is called by Wails when the app closes. @@ -117,7 +136,25 @@ func (a *App) RemoveAllowedDirectory(dir string) { a.engine.SetAllowedDirectories(filtered) } -// --- Policy management: content keywords --- +// --- Policy management: content tags (hard block) --- + +func (a *App) GetDenyContentTags() []string { + return a.engine.GetDenyContentTags() +} + +func (a *App) SetDenyContentTags(tags []string) { + a.engine.SetDenyContentTags(tags) +} + +func (a *App) AddDenyContentTag(tag string) { + a.engine.AddDenyContentTag(tag) +} + +func (a *App) RemoveDenyContentTag(tag string) { + a.engine.RemoveDenyContentTag(tag) +} + +// --- Policy management: content keywords (interactive) --- func (a *App) GetDenyContentKeywords() []string { return a.engine.GetDenyContentKeywords() @@ -135,20 +172,20 @@ func (a *App) RemoveDenyContentKeyword(keyword string) { a.engine.RemoveDenyContentKeyword(keyword) } -func (a *App) GetContentKeywordWhitelist() []string { - return a.engine.GetContentKeywordWhitelist() +func (a *App) GetContentWhitelist() []string { + return a.engine.GetContentWhitelist() } -func (a *App) RemoveFromContentKeywordWhitelist(path string) { - a.engine.RemoveFromContentKeywordWhitelist(path) +func (a *App) RemoveFromContentWhitelist(path string) { + a.engine.RemoveFromContentWhitelist(path) } -func (a *App) GetContentKeywordBlacklist() []string { - return a.engine.GetContentKeywordBlacklist() +func (a *App) GetContentBlacklist() []string { + return a.engine.GetContentBlacklist() } -func (a *App) RemoveFromContentKeywordBlacklist(path string) { - a.engine.RemoveFromContentKeywordBlacklist(path) +func (a *App) RemoveFromContentBlacklist(path string) { + a.engine.RemoveFromContentBlacklist(path) } // --- Policy bypass --- @@ -222,10 +259,10 @@ func (a *App) ResolveContentPrompt(promptID string, action string) { func (a *App) ResolveContentPromptForFile(action string, filePath string) { switch policy.PromptAction(action) { case policy.PromptAllowAlways: - a.engine.AddToContentKeywordWhitelist(filePath) + a.engine.AddToContentWhitelist(filePath) slog.Info("file added to content keyword whitelist", "path", filePath) case policy.PromptBlockAlways: - a.engine.AddToContentKeywordBlacklist(filePath) + a.engine.AddToContentBlacklist(filePath) slog.Info("file added to content keyword blacklist", "path", filePath) } } @@ -235,9 +272,10 @@ func (a *App) ResolveContentPromptForFile(action string, filePath string) { func (a *App) SaveConfig() error { a.cfg.Policy.DenyFilePatterns = a.engine.GetDenyPatterns() a.cfg.Policy.AllowedDirectories = a.engine.GetAllowedDirectories() + a.cfg.Policy.DenyContentTags = a.engine.GetDenyContentTags() a.cfg.Policy.DenyContentKeywords = a.engine.GetDenyContentKeywords() - a.cfg.Policy.ContentKeywordWhitelist = a.engine.GetContentKeywordWhitelist() - a.cfg.Policy.ContentKeywordBlacklist = a.engine.GetContentKeywordBlacklist() + a.cfg.Policy.ContentWhitelist = a.engine.GetContentWhitelist() + a.cfg.Policy.ContentBlacklist = a.engine.GetContentBlacklist() return config.Save(a.cfgPath, a.cfg) } diff --git a/internal/ui/frontend/src/App.tsx b/internal/ui/frontend/src/App.tsx index 32c8954..414bf17 100644 --- a/internal/ui/frontend/src/App.tsx +++ b/internal/ui/frontend/src/App.tsx @@ -83,6 +83,9 @@ function App() { allowedDirs={policy.allowedDirs} onAddDir={policy.addDirectory} onRemoveDir={policy.removeDirectory} + contentTags={policy.contentTags} + onAddTag={policy.addContentTag} + onRemoveTag={policy.removeContentTag} contentKeywords={policy.contentKeywords} onAddKeyword={policy.addContentKeyword} onRemoveKeyword={policy.removeContentKeyword} diff --git a/internal/ui/frontend/src/components/PolicyEditor.tsx b/internal/ui/frontend/src/components/PolicyEditor.tsx index ac37934..61c2bd1 100644 --- a/internal/ui/frontend/src/components/PolicyEditor.tsx +++ b/internal/ui/frontend/src/components/PolicyEditor.tsx @@ -7,6 +7,9 @@ interface Props { allowedDirs: string[]; onAddDir: (dir: string) => void; onRemoveDir: (dir: string) => void; + contentTags: string[]; + onAddTag: (tag: string) => void; + onRemoveTag: (tag: string) => void; contentKeywords: string[]; onAddKeyword: (keyword: string) => void; onRemoveKeyword: (keyword: string) => void; @@ -20,6 +23,7 @@ interface Props { export function PolicyEditor({ patterns, onAdd, onRemove, allowedDirs, onAddDir, onRemoveDir, + contentTags, onAddTag, onRemoveTag, contentKeywords, onAddKeyword, onRemoveKeyword, whitelist, onRemoveWhitelist, blacklist, onRemoveBlacklist, @@ -27,6 +31,7 @@ export function PolicyEditor({ }: Props) { const [patternInput, setPatternInput] = useState(''); const [dirInput, setDirInput] = useState(''); + const [tagInput, setTagInput] = useState(''); const [keywordInput, setKeywordInput] = useState(''); const [saved, setSaved] = useState(false); @@ -38,6 +43,14 @@ export function PolicyEditor({ } }; + const handleAddTag = () => { + const trimmed = tagInput.trim(); + if (trimmed && !contentTags.includes(trimmed)) { + onAddTag(trimmed); + setTagInput(''); + } + }; + const handleAddDir = () => { const trimmed = dirInput.trim(); if (trimmed && !allowedDirs.includes(trimmed)) { @@ -98,8 +111,19 @@ export function PolicyEditor({ - {/* Content Keywords */} -
+ {/* Content Tags (hard block) */} +
+ + +
+ + {/* Content Keywords (interactive) */} +
([]); const [allowedDirs, setAllowedDirs] = useState([]); + const [contentTags, setContentTags] = useState([]); const [contentKeywords, setContentKeywords] = useState([]); const [whitelist, setWhitelist] = useState([]); const [blacklist, setBlacklist] = useState([]); @@ -30,9 +34,10 @@ export function usePolicy() { useEffect(() => { GetDenyPatterns().then(setPatterns); GetAllowedDirectories().then(setAllowedDirs); + GetDenyContentTags().then(setContentTags); GetDenyContentKeywords().then(setContentKeywords); - GetContentKeywordWhitelist().then(setWhitelist); - GetContentKeywordBlacklist().then(setBlacklist); + GetContentWhitelist().then(setWhitelist); + GetContentBlacklist().then(setBlacklist); IsPolicyBypassed().then(setBypassed); }, []); @@ -63,7 +68,18 @@ export function usePolicy() { setAllowedDirs(await GetAllowedDirectories()); }, []); - // Content keywords + // Content tags (hard block) + const addContentTag = useCallback(async (tag: string) => { + await AddDenyContentTag(tag); + setContentTags(await GetDenyContentTags()); + }, []); + + const removeContentTag = useCallback(async (tag: string) => { + await RemoveDenyContentTag(tag); + setContentTags(await GetDenyContentTags()); + }, []); + + // Content keywords (interactive) const addContentKeyword = useCallback(async (keyword: string) => { await AddDenyContentKeyword(keyword); setContentKeywords(await GetDenyContentKeywords()); @@ -76,14 +92,14 @@ export function usePolicy() { // Whitelist const removeFromWhitelist = useCallback(async (path: string) => { - await RemoveFromContentKeywordWhitelist(path); - setWhitelist(await GetContentKeywordWhitelist()); + await RemoveFromContentWhitelist(path); + setWhitelist(await GetContentWhitelist()); }, []); // Blacklist const removeFromBlacklist = useCallback(async (path: string) => { - await RemoveFromContentKeywordBlacklist(path); - setBlacklist(await GetContentKeywordBlacklist()); + await RemoveFromContentBlacklist(path); + setBlacklist(await GetContentBlacklist()); }, []); // Bypass @@ -100,6 +116,7 @@ export function usePolicy() { return { patterns, bypassed, addPattern, removePattern, updatePatterns, allowedDirs, addDirectory, removeDirectory, + contentTags, addContentTag, removeContentTag, contentKeywords, addContentKeyword, removeContentKeyword, whitelist, removeFromWhitelist, blacklist, removeFromBlacklist, diff --git a/internal/ui/frontend/wailsjs/go/ui/App.ts b/internal/ui/frontend/wailsjs/go/ui/App.ts index d1708da..79b7a55 100644 --- a/internal/ui/frontend/wailsjs/go/ui/App.ts +++ b/internal/ui/frontend/wailsjs/go/ui/App.ts @@ -46,7 +46,20 @@ export function RemoveAllowedDirectory(dir: string): Promise { return window.go.ui.App.RemoveAllowedDirectory(dir); } -// Content keywords +// Content tags (hard block) +export function GetDenyContentTags(): Promise { + return window.go.ui.App.GetDenyContentTags(); +} + +export function AddDenyContentTag(tag: string): Promise { + return window.go.ui.App.AddDenyContentTag(tag); +} + +export function RemoveDenyContentTag(tag: string): Promise { + return window.go.ui.App.RemoveDenyContentTag(tag); +} + +// Content keywords (interactive) export function GetDenyContentKeywords(): Promise { return window.go.ui.App.GetDenyContentKeywords(); } @@ -63,20 +76,20 @@ export function RemoveDenyContentKeyword(keyword: string): Promise { return window.go.ui.App.RemoveDenyContentKeyword(keyword); } -export function GetContentKeywordWhitelist(): Promise { - return window.go.ui.App.GetContentKeywordWhitelist(); +export function GetContentWhitelist(): Promise { + return window.go.ui.App.GetContentWhitelist(); } -export function RemoveFromContentKeywordWhitelist(path: string): Promise { - return window.go.ui.App.RemoveFromContentKeywordWhitelist(path); +export function RemoveFromContentWhitelist(path: string): Promise { + return window.go.ui.App.RemoveFromContentWhitelist(path); } -export function GetContentKeywordBlacklist(): Promise { - return window.go.ui.App.GetContentKeywordBlacklist(); +export function GetContentBlacklist(): Promise { + return window.go.ui.App.GetContentBlacklist(); } -export function RemoveFromContentKeywordBlacklist(path: string): Promise { - return window.go.ui.App.RemoveFromContentKeywordBlacklist(path); +export function RemoveFromContentBlacklist(path: string): Promise { + return window.go.ui.App.RemoveFromContentBlacklist(path); } // Content prompt resolution