Safety and control improvements: visibility, hardening, and trust UX

## Overview

Make the existing safety architecture visible to users and close remaining security gaps. The goal: every defense layer should be something users can see, verify, and trust.

## Tasks

### 1. `obk safety status` command

Add a CLI command that shows the user their current security posture in one view:
- Status of all defense layers (prompt hardening, content boundaries, injection scanning, approval gates, auto-approve rules, tool tiers, bash filtering, sandboxing, scheduled restrictions, audit logging)
- Recent activity summary (last approval, injections blocked, approvals today)
- Recommendations for improvements (e.g. encryption not enabled)

This is likely the highest-impact UX feature for differentiation: we are not aware of another AI assistant that surfaces its full security posture in a single view.

### 2. Approval receipts

After every approved action, show a receipt to the user:
```
✓ Email sent to sarah@example.com
  Subject: "Meeting tomorrow"
  Approved: 14:32 UTC | Audit: obk audit show 847
```
Every action gets a paper trail the user can see and verify.

### 3. Injection attempt alerts (user-visible)

When `ScanForInjection()` detects a pattern, alert the user with context instead of silently filtering:
```
⚠ Prompt injection detected in web_fetch output
  Pattern: "ignore previous instructions"
  Source: https://example.com/page
  Action: Content quarantined
```
Most agents silently filter injections. Showing the user builds trust and educates them about threats.
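
One possible shape for the alert, assuming the scanner can report which tool produced the content, the matched pattern, and the source. The signature of `ScanForInjection()` is not shown here, so `InjectionFinding` and `formatAlert` are hypothetical stand-ins:

```go
package main

import "fmt"

// InjectionFinding is a hypothetical result type for one detected pattern.
type InjectionFinding struct {
	Tool    string // tool whose output was scanned, e.g. "web_fetch"
	Pattern string // the matched text
	Source  string // where the content came from
	Action  string // what was done with the content
}

// formatAlert renders the finding for the user. The pattern is printed with
// %q so that control characters in attacker-supplied text are escaped.
func formatAlert(f InjectionFinding) string {
	return fmt.Sprintf(
		"⚠ Prompt injection detected in %s output\n  Pattern: %q\n  Source: %s\n  Action: %s\n",
		f.Tool, f.Pattern, f.Source, f.Action)
}

func main() {
	fmt.Print(formatAlert(InjectionFinding{
		Tool:    "web_fetch",
		Pattern: "ignore previous instructions",
		Source:  "https://example.com/page",
		Action:  "Content quarantined",
	}))
}
```

Because the alert echoes untrusted text back to the user, the source URL may deserve the same escaping as the pattern.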

### 4. Audit log integrity (hash chain)

Make audit logs tamper-evident:
- Add a `prev_hash` field to audit entries
- Each entry includes the SHA-256 hash of the previous entry
- `obk audit verify` command walks the chain and reports breaks
- `obk audit list` shows recent tool calls with approval status
- `obk audit show <id>` shows full details of a specific action
- `obk audit stats` shows summary counts
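
The hash chain can be sketched as follows. The `Entry` type, its field names, and the `id|action|prev_hash` serialization are assumptions made for illustration; a real implementation would hash a canonical encoding of the full audit record:

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// Entry is a hypothetical, simplified audit record.
type Entry struct {
	ID       int
	Action   string
	PrevHash string // hex SHA-256 of the previous entry; empty for the first
}

// hashEntry hashes a simple serialization of the entry (an assumption here).
func hashEntry(e Entry) string {
	sum := sha256.Sum256([]byte(fmt.Sprintf("%d|%s|%s", e.ID, e.Action, e.PrevHash)))
	return hex.EncodeToString(sum[:])
}

// appendEntry links a new entry to the current tail of the log.
func appendEntry(entries []Entry, action string) []Entry {
	e := Entry{ID: len(entries) + 1, Action: action}
	if n := len(entries); n > 0 {
		e.PrevHash = hashEntry(entries[n-1])
	}
	return append(entries, e)
}

// verifyChain returns the index of the first entry whose PrevHash does not
// match the hash of its predecessor, or -1 if the chain is intact.
func verifyChain(entries []Entry) int {
	for i := 1; i < len(entries); i++ {
		if entries[i].PrevHash != hashEntry(entries[i-1]) {
			return i
		}
	}
	return -1
}

func main() {
	var entries []Entry
	for _, a := range []string{"email.send", "file.read", "web.fetch"} {
		entries = appendEntry(entries, a)
	}
	fmt.Println("intact:", verifyChain(entries) == -1)
	entries[0].Action = "tampered"
	fmt.Println("first break at index:", verifyChain(entries))
}
```

A chain like this detects edits to any entry that has a successor. Changes to the newest entry, or truncation of the tail, go unnoticed unless the latest hash is also recorded somewhere outside the log.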

### 5. Path traversal guards on file tools

The learnings tool already has path traversal protection (`b5b49ca`). Apply the same pattern to `file_read` and `dir_explore`:
- `filepath.EvalSymlinks()` before any file operation
- Reject paths containing `..` after cleaning
- Optionally enforce a workspace boundary (configurable root)

### 6. Error message scrubbing

Audit all `fmt.Errorf` calls in `internal/server/`, `provider/`, and `oauth/` for information leakage:
- Remove env var names from error messages
- Remove file paths from user-facing HTTP responses
- Keep detailed errors in logs, return generic messages to clients

### 7. SECURITY.md

Create a security policy file:
- Responsible disclosure process with security contact email
- Scope (all code in the repo)
- 90-day disclosure timeline
- Credit given to reporters with consent

### 8. Autonomy dial (control modes)

Let users calibrate agent independence with four modes:

| Mode | Behavior |
|---|---|
| **Observe** | Agent suggests actions but never executes on its own. User confirms everything, including reads. |
| **Supervised** | Reads are free. Writes need approval. (Current default.) |
| **Trusted** | Auto-approve low and medium risk. Only high-risk needs approval. |
| **Autonomous** | Everything auto-approved. Post-execution notifications only. |

- Mode is session-scoped (not persistent), matching existing auto-approve philosophy
- Changeable mid-session via command
- Default is `supervised` (no breaking change)
- `autonomous` requires explicit opt-in with one-time risk warning
- Per-tool overrides in config for granular control
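
The mode table and per-tool overrides can be expressed as a single decision function. This is a sketch under assumptions: the type names, the read/write flag, the three risk levels, and the tool names other than `file_read` are hypothetical, and Observe is read here as requiring confirmation for every action:

```go
package main

import "fmt"

type Mode int

const (
	Observe Mode = iota
	Supervised
	Trusted
	Autonomous
)

type Risk int

const (
	Low Risk = iota
	Medium
	High
)

type Decision int

const (
	NeedsApproval Decision = iota
	AutoApprove
)

// decide applies a per-tool override if one exists, otherwise the mode table.
func decide(m Mode, tool string, isWrite bool, r Risk, overrides map[string]Decision) Decision {
	if d, ok := overrides[tool]; ok {
		return d
	}
	switch m {
	case Observe:
		return NeedsApproval
	case Supervised:
		if isWrite {
			return NeedsApproval
		}
		return AutoApprove
	case Trusted:
		if r == High {
			return NeedsApproval
		}
		return AutoApprove
	case Autonomous:
		return AutoApprove
	}
	return NeedsApproval // unknown modes fail closed
}

func main() {
	overrides := map[string]Decision{"bash": NeedsApproval}
	fmt.Println(decide(Autonomous, "bash", true, High, overrides) == NeedsApproval)
	fmt.Println(decide(Supervised, "file_read", false, Low, nil) == AutoApprove)
}
```

Checking overrides first lets a user pin a sensitive tool to approval even in `autonomous` mode. Whether overrides should also be able to loosen a stricter mode is a design question this sketch leaves open.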

## Priority Order

P1: `obk safety status` (highest user impact)
P2: Approval receipts + injection alerts (quick wins, high trust signal)
P3: Audit integrity + path traversal guards (hardening)
P4: Error scrubbing + SECURITY.md (production readiness)
P5: Autonomy dial (requires more design work)
