Mention it in a thread; it plans, runs real tasks against your tools, builds and executes code in a sandbox, remembers how your team works, and shows you every step and every cent, all on infrastructure you control.
Quickstart · Compare · Features · Security · Architecture · Community
Important
Kortny is early and moving fast. Star the repo to follow releases, it genuinely helps the project get found.
Hosted AI coworkers are black boxes: your Slack history and files flow into someone else's cloud, pricing is opaque, and you can't see what the agent actually did. Generic Slack bots answer questions but don't do the work.
Kortny is the alternative you run yourself:
- It finishes what it starts: every request becomes a tracked task with a full log of every step, tool call, approval, and decision.
- It computes instead of guessing: asked for a dashboard, a report, or an analysis, Kortny writes and executes real code in an isolated sandbox, verifies the output, and delivers a file or a live preview link.
- No surprise bills: per-task model, token, and cost accounting in the built-in dashboard. BYO LLM keys (OpenAI, Anthropic, OpenRouter).
- It gets better the longer it's there: workspace memory, episodic recall, and a knowledge graph built from how your team actually works.
- Your bot, your identity: ship it under your own Slack app name and avatar. Kortny runs the brain; you own the face.
- Apache-2.0, Docker Compose, your Postgres: no cloud control plane, no vendor in your data path.
| Kortny | Hosted AI coworkers | Generic Slack bots | |
|---|---|---|---|
| Where your data lives | Your servers | Vendor cloud | Vendor cloud |
| Runs on your infrastructure | ✅ | ❌ | ❌ |
| Executes real code & ships files | ✅ sandboxed | sometimes | ❌ |
| Long-term memory + knowledge graph | ✅ | varies | ❌ |
| Full audit log: every step & cost | ✅ | ❌ opaque | partial |
| Bring-your-own LLM keys (no markup) | ✅ | ❌ | ❌ |
| Your own bot name & branding | ✅ | ❌ | varies |
| License | Apache-2.0 | proprietary | proprietary |
Same idea as a hosted AI coworker, minus the black box: you see every step, pay your model provider directly, and nothing leaves infrastructure you control.
Note
Prerequisites: Docker + Docker Compose, a Slack workspace you can install apps into, and one LLM API key (OpenAI, Anthropic, or OpenRouter). A Composio key enables the 100+ integration catalog.
git clone https://github.com/boffti/kortny
cd kortny
cp .env.example .env # fill in Slack + LLM keys (see below)
docker compose up -d --force-recreateThen put it to work: talk to it in a channel, in a DM, or in the assistant side-panel (the ✨ panel, built for longer back-and-forth with live step-by-step status):
/invite @your-bot-name
@your-bot-name summarize the last 7 days of this channel
@your-bot-name build me a dashboard from the CSV I just uploaded
@your-bot-name turn that into a slide deck and post it here
The operator dashboard at http://localhost:8080 has tasks, traces, costs, memory, schedules, skills, integrations, model config, knowledge graph, and users.
Tip
Booting with an incomplete .env? The dashboard comes up in a guided setup wizard that validates your Slack + LLM keys and writes the rest of your config for you.
Step 1: Create your Slack app (~5 minutes)
- Go to https://api.slack.com/apps → Create New App → From Manifest
- Paste the contents of
manifest.jsonfrom this repo - Name the bot whatever you want, this is your bot, your brand
- Optional: upload an avatar. The Kortny icon ships at
kortny/dashboard/static/assets/kortny_icon.png - Install the app to your workspace
- Generate the App-Level Token for Socket Mode: Basic Information → App-Level Tokens → Generate Token and Scopes, add the
connections:writescope, and copy it (xapp-...). The manifest can't create this for you, it must be generated by hand. - Collect your three required credentials for
.env:- Bot Token (
xoxb-...), OAuth & Permissions (available after install) - App-Level Token (
xapp-...), from step 6 - Signing Secret: Basic Information → App Credentials
- Bot Token (
- Optional, only for "Sign in with Slack" dashboard login (
DASHBOARD_AUTH_MODE=hybridorslack): copy the app's Client ID and Client Secret, and add your dashboard's/auth/slack/callback(e.g.http://localhost:8080/auth/slack/callback) as an OAuth redirect URL. The default password login needs none of this.
If you update an existing Slack app from this repo's manifest, re-apply the manifest in Slack and reinstall the app so new scopes and event subscriptions take effect.
Step 2: Configure .env
SLACK_BOT_TOKEN=xoxb-...
SLACK_APP_TOKEN=xapp-...
SLACK_SIGNING_SECRET=...
LLM_PROVIDER=openai # openai | anthropic | openrouter
LLM_API_KEY=sk-...
LLM_MODEL=gpt-4o
COMPOSIO_API_KEY=...
DASHBOARD_AUTH_MODE=hybrid
DASHBOARD_SLACK_CLIENT_ID=...
DASHBOARD_SLACK_CLIENT_SECRET=...
DASHBOARD_SLACK_REDIRECT_URI=http://localhost:8080/auth/slack/callbackThat's enough for the default stack. Everything else in .env.example is optional with sane local defaults: Postgres credentials, scheduler, Witness, sandbox tuning, observability, Temporal.
Worth knowing:
BRAVE_SEARCH_API_KEYenables the built-in web search tool.ENCRYPTION_KEYis required before saving dashboard-managed secrets.KORTNY_PUBLIC_BASE_URL+KORTNY_PREVIEW_SIGNING_SECRETenable shareable preview links for sandbox-built dashboards and sites.NETLIFY_AUTH_TOKEN/VERCEL_TOKENenable one-approval site deploys.- Change
DASHBOARD_PASSWORDandDASHBOARD_SESSION_SECRETbefore exposing the dashboard beyond localhost (it binds to127.0.0.1by default). - Image understanding routes to a dedicated vision model. Set
LLM_VISION_MODELto a vision-capable model (a current Claude model oropenai/gpt-4o); tasks with image uploads are routed to it automatically based on the uploaded file types. It falls back toLLM_MODELwhen unset, and if neither is vision-capable an image request fails with a clear message instead of guessing.
What docker compose up starts
The default stack is 7 long-running containers + one-shot migrate: every line is something you can see, name, and reason about (no black box):
| Service | What it does | What breaks without it | Profile |
|---|---|---|---|
postgres |
All state: tasks, events, memory, costs, schedules | Everything (the whole system) | default |
migrate |
Runs Alembic migrations, then exits (one-shot) | Schema drifts; app/worker boot against a stale DB | default |
app |
Slack Socket Mode ingress (Bolt) | Slack events never become tasks | default |
worker |
Task executor (the LLM + tool loop) | Tasks queue but never run | default |
ambient |
Supervises three near-idle pollers in one process: scheduler (materializes due schedules), witness (proactive opportunity scanner), consolidator (sleep-time memory/graph consolidation) | Scheduled tasks stop firing, no proactive suggestions, memory stops consolidating into the graph; the request path still works | default |
dashboard |
Operator UI at localhost:8080, tasks, traces, costs, memory, schedules, skills, integrations, model config, knowledge graph, users |
No web console (system keeps running headless) | default |
sandbox-runner |
HTTP service that launches throwaway code-exec containers | Code execution / the sandboxed coding workbench is unavailable | default |
sandbox-docker-proxy |
Restricted Docker-socket proxy the runner reaches the Docker API through | sandbox-runner can't start containers |
default |
phoenix |
Local OTEL trace UI at localhost:6006 |
No local trace UI (tracing still exports if an endpoint is set) | observability |
temporal + temporal-worker |
Experimental durable workflow backend (UI at localhost:8233) |
Falls back to the inline worker (the default) | temporal |
Sandbox code execution is on by default: it's the flagship coworker capability, not a profile add-on. To opt out (e.g. a host that doesn't want code exec), set KORTNY_SANDBOX_EXECUTION_ENABLED=false or drop the two sandbox-* services via a compose.override.yaml.
The three pollers used to be three separate containers; they're now supervised threads inside one ambient process with per-loop crash isolation and restart backoff. Each loop still guards its work with a Postgres advisory lock, so the split entrypoints (python -m kortny.scheduler / kortny.witness / kortny.consolidator) remain the scale-out path: run any of them as its own container to peel a loop back out, no double-work.
Witness is on by default and intentionally bounded: it only auto-starts non-interruptive, read-only proactive tasks (one per tick), requires channel membership before posting, and never DMs unless KORTNY_WITNESS_DELIVER_PRIVATE=true.
|
🛠️ Real work, not just answers Plans, calls your tools, writes and runs code in a sandbox, then ships a file or a live preview link. |
📄 Documents that look designed PDF reports, slide decks, spreadsheets, and charts, not plain-text dumps. |
🧩 Skills, and your own 45+ built-in skills (reports, research, analysis); drop in your own from the dashboard. |
|
🧠 Gets smarter the longer it's there Workspace memory, episodic recall, and a knowledge graph built from how your team works. |
🔌 Plugs into your stack 100+ Composio integrations, your own MCP servers, and built-in web search. |
🪪 Yours, end to end Your Slack name and avatar, your LLM keys, your servers. Every token and cent tracked. |
Full capability matrix
| Capability | Status |
|---|---|
| Slack mention → durable tracked task with full audit log | ✅ Shipped |
| Sandboxed coding workbench: writes & executes code for dashboards, reports, analysis; verifies by running it | ✅ Shipped |
| Shareable preview links for sandbox-built dashboards/sites | ✅ Shipped |
| One-approval deploys to Netlify / Vercel (tokens never enter the sandbox) | ✅ Shipped |
| Per-task cost, token, and model accounting | ✅ Shipped |
| Workspace memory + episodic recall + knowledge graph | ✅ Shipped |
| Scheduled tasks from natural language ("every Monday at 9...") | ✅ Shipped |
| Proactive suggestions from observed activity (Witness) | ✅ Shipped |
| Approval gates: reaction-based confirmation for risky actions | ✅ Shipped |
| 100+ integrations via Composio OAuth (Gmail, GitHub, HubSpot, ...) | ✅ Shipped |
| Documents: PDF reports, slide decks, spreadsheets, charts | ✅ Shipped |
| Agent Skills: 45+ built-in + bring-your-own (dashboard upload/paste), trust-tiered | ✅ Shipped |
| Slack-native output: creates & edits Canvases, bookmarks, pins messages | ✅ Shipped |
| Assistant side-panel: long conversations with live step-by-step status | ✅ Shipped |
| Reads uploaded files (CSV, docs) for analysis | ✅ Shipped |
| Built-in web search (Brave) | ✅ Shipped |
| Multi-user: per-user integrations, usage & schedules; admin/member roles | ✅ Shipped |
| Multi-tier model routing (cheap → high-reasoning) with dashboard model config | ✅ Shipped |
| Response humanizer: Slack-native tone | ✅ Shipped |
Guided first-run setup wizard: validates Slack/LLM keys, renders .env |
✅ Shipped |
| Prompt-injection + tool-trust defenses | ✅ Shipped |
| Per-channel personality profiles (tone, verbosity, proactivity) | ✅ Shipped |
| OTEL tracing (Phoenix local / Langfuse Cloud) | ✅ Shipped |
| Bring-your-own MCP servers (Composio-free integration plane) | ✅ Shipped |
| Temporal durable workflow backend | 🟡 Experimental |
| Network-enabled sandbox profile (pip/npm via egress allowlist) | ⬜ Planned |
Skills are how Kortny knows how to do specialized work. Each is a folder of instructions (and optional scripts) the agent loads on demand, so the right playbook is in context only when the task calls for it.
- 45+ curated skills out of the box: styled PDF reports, slide decks, spreadsheets and charts, cited research briefs, competitive analysis, meeting notes, changelogs, weekly digests, and more.
- Bring your own: add a skill from the dashboard by uploading a folder/zip or pasting Markdown. No redeploy.
- Trust-tiered: every skill carries a trust level (
trusted→community→untrusted→quarantined); only trusted skills may run scripts, and only inside the sandbox. - Scoped enablement: switch a skill on per workspace, channel, or user; the narrowest scope wins.
- Progressive loading: the agent sees a short index, then pulls a skill's full instructions and resources only when it picks one, keeping context lean.
Teach Kortny your report format, your research process, or your tone once, and it reuses them, no code required.
Kortny splits into two halves: the request path (you ask, it answers in seconds to minutes) and the ambient stack (it watches, learns, and acts unprompted over hours to weeks). The request path is the visible product; the ambient stack is what makes it a coworker instead of a chatbot. Four systems form one pipeline: observe → remember → consolidate → act.
Messages in channels Kortny sits in (not addressed to it) land as observation events, gated by per-channel policy (observation on/off, proactivity level, retention). No LLM runs at ingest. Periodically a channel profile is assessed: Kortny's working impression of what the room is about (who's in it, what work recurs, what's struggling), with confidence and evidence.
- Facts are small, explicit, user-confirmed truths ("we use AMA citation style"). They follow propose → Slack confirm → active; nothing automated may ever invalidate one.
- The knowledge graph is the machine-inferred web: entities (people, projects, decisions, commitments) and edges, with three properties that keep it honest:
- Lifecycle: entities are born
candidate, earnactive/confirmedthrough corroboration, and decay tostale/archivedwhen nothing reinforces them. - Bi-temporal truth: contradiction never deletes. The old belief gets
invalid_atstamped and a successor is created, so "what did we believe on June 10" has a real answer and stale facts don't get confidently repeated. - Provenance: every entity carries evidence rows pointing at the exact tasks/episodes/observations it came from. "Why do you think that?" always has an answer.
- Lifecycle: entities are born
- Episodes are per-task experience summaries, the raw feedstock for consolidation, retrieved into context by thread/channel/user proximity.
If experience just piled up, memory would be an append-only junk drawer. The consolidator is Kortny's sleep cycle: it runs when there's enough new material and the workspace is quiet, and turns experience into knowledge through committed-independently passes: promotion (an LLM arbitrates each episode against existing knowledge: ADD / UPDATE / INVALIDATE / NOOP; most things are NOOP; chitchat doesn't become memory), candidate adjudication, duplicate merge, aging, fact reconciliation, retention hygiene, per-channel style cards (so Kortny sounds like the room), and embedding backfill. Every run is visible on the dashboard with per-pass counters and LLM cost.
The only piece that acts unprompted. It reads fresh channel profiles, extracts opportunity candidates ("this desk posts a manual trading summary every morning, I could automate that"), dedupes them into reinforcement, and gates delivery through a deterministic receptivity score built from your accept/dismiss history; staying silent is a scored, logged decision. Delivery is budgeted (digest DMs, ≤1 channel post/week). Accepting a recurring suggestion materializes a real schedule with one confirmation: a "yes" creates a standing behavior, not a one-time answer.
channel chatter ──► Observe (events → profiles)
│
tasks ──► Episodes │
│ │
▼ ▼
Consolidation: promote / merge / invalidate / age
│
▼
Graph + Facts (bi-temporal, evidenced)
│ │
▼ ▼
Context assembly (better answers) Witness (better suggestions)
│ accept
▼
Schedules (standing automations)
│
▼
more tasks → more episodes → loop
Every drive in this loop is visible: the witness scan, memory consolidation, and catalog sync run as pausable System schedules on the dashboard, every graph entity shows its evidence, and every suggestion carries its reasons. The whole point of self-hosting an AI coworker is that none of this is a black box.
Kortny executes model-written code and acts on your tools, so the guardrails are harness-owned, not model-owned. Full details in SECURITY.md.
- Sandboxed execution. Untrusted code never runs in the worker. It runs in dedicated Docker containers: all capabilities dropped, read-only root filesystem, no network, CPU/memory/PID caps, per-task workspaces removed after use. Workers reach Docker only through a restricted socket proxy.
- Human approval. Sandbox sessions, external writes, and deploys require explicit requester approval in Slack (react ✅) before execution.
- Secrets stay out of reach. Integration and deploy tokens live on the trusted worker. Sandboxed code (and the model) never see them.
- Execution guardrails. Max turns, tool-call budgets, and a circuit breaker for repeated failures, enforced by the harness.
- Prompt-injection & tool-trust defenses. Untrusted content (channel text, web results, tool output) can't silently escalate into risky actions. The harness tracks each tool's trust and gates the dangerous combinations, so a poisoned message can't make Kortny exfiltrate secrets or destroy data.
- Everything is auditable. Every tool call, sandbox command, approval, and output preview lands in the task timeline, inspectable in the dashboard.
| Stays on your server | Leaves your server (you choose to whom) |
|---|---|
| Slack event handling (your bot token) | LLM inference → your provider (OpenAI / Anthropic / OpenRouter), your keys |
| All tasks, step logs, approvals, audit trail (Postgres) | External tool actions → Composio (OAuth broker; credentials are brokered, the model never sees raw tokens) |
| Memory, episodes, knowledge graph | Web search → Brave (only if you enable it) |
| Cost & usage accounting | Deploys → Netlify/Vercel (only on explicit approved request) |
| Sandbox execution & artifacts | |
| Dashboard & traces |
There is no Kortny cloud in the data path, no telemetry, no phone-home. If you need a fully self-contained integration plane, register your own MCP servers (KORTNY_MCP_ENABLED, default on) and skip the Composio broker entirely.
The Quickstart stack bind-mounts your source and builds nothing, perfect for hacking, wrong for a server. For a real deployment on a fresh Linux box (DigitalOcean, Hetzner, EC2, any Ubuntu/Debian VPS), one line does it:
curl -fsSL https://raw.githubusercontent.com/boffti/kortny/main/scripts/install.sh | bashIt installs Docker if needed, pulls the published GHCR images, generates the required secrets, and brings up the full stack with the code sandbox, then points you at the setup wizard. Idempotent, so it's safe to re-run.
Note
Why no Railway / Render / Vercel button? Kortny's flagship layer (code execution, document rendering, and skill scripts) runs in throwaway Docker containers spawned through a Docker socket. Those PaaS platforms don't expose one, so a one-click deploy there would ship a Kortny with the sandbox disabled. The installer above targets a real VM, where the sandbox actually works.
Manual install (pin a version, customize the overlay)
export KORTNY_VERSION=v1.2.3
docker compose -f compose.yaml -f compose.prod.yaml pull
docker compose -f compose.yaml -f compose.prod.yaml up -dThe overlay swaps every service to ghcr.io/boffti/kortny (non-root image, no source mount), forces secure cookies, sets restart policies and memory limits, and arms a startup secret guard that refuses to boot with placeholder secrets (ENCRYPTION_KEY, DASHBOARD_SESSION_SECRET, DASHBOARD_PASSWORD). Front the localhost-bound dashboard with a reverse proxy.
Full guide in docs/deployment.md: images, required env, reverse proxy (Caddy + nginx), backups, upgrades, and sandbox container GC.
flowchart LR
subgraph YOUR["Your server, docker compose"]
APP["Slack ingress<br/>(Bolt, Socket Mode)"]
PG[("Postgres<br/>tasks · memory · costs")]
WORKER["Worker<br/>LLM + tool loop"]
AMBIENT["Ambient<br/>scheduler · witness · consolidator"]
DASH["Dashboard :8080<br/>tasks · traces · costs · previews"]
RUNNER["Sandbox runner"]
SBX["Hardened sandbox containers<br/>no network · caps dropped"]
end
SLACK(("Slack")) <--> APP
APP --> PG
WORKER --> PG
AMBIENT --> PG
DASH --> PG
WORKER --> RUNNER --> SBX
WORKER -.->|"inference (your keys)"| LLM(("LLM provider"))
WORKER -.->|"tool actions"| COMPOSIO(("Composio"))
WORKER -.->|"approved deploys"| HOSTS(("Netlify / Vercel"))
How a task flows: a Slack mention (or schedule, or Witness suggestion) becomes a Task row → the worker leases it from the Postgres queue → the agent loop plans, calls tools, pauses for approvals, executes code in the sandbox when the work demands computation → the result posts back to the originating thread, and every step is queryable in the dashboard.
Local development
Kortny uses uv for dependency management.
uv sync
make check # ruff lint + format-check, mypy, pytest
make lint # ruff check
make typecheck # mypy
make test # pytest (unit)Host-side commands need a local POSTGRES_URL:
export POSTGRES_URL=postgresql://kortny:kortny@localhost:5432/kortny
make migrate
uv run python -m kortny.worker --once # process one pending task
uv run python -m kortny.witness --once # one Witness scan/autopilot tickDB-backed integration tests require a dedicated test database (the harness refuses to run against the dev DB):
docker compose exec postgres createdb -U kortny kortny_test
KORTNY_TEST_POSTGRES_URL=postgresql://kortny:kortny@localhost:5432/kortny_test uv run pytestOptional pre-commit hooks: uv run pre-commit install.
Observability (Phoenix / Langfuse)
Local trace UI with one extra container:
make compose-up-observability # Phoenix at http://localhost:6006For Langfuse Cloud (or a separate Langfuse instance), set:
OTEL_EXPORTER_OTLP_ENDPOINT=https://cloud.langfuse.com/api/public/otel/v1/traces
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Basic <base64 pk:sk>,x-langfuse-ingestion-version=4
LANGFUSE_ENABLED=true
LANGFUSE_HOST=https://cloud.langfuse.com
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...Sandboxed code execution: policy details
The worker never mounts the Docker socket. It calls the internal sandbox-runner service over the Compose network, which reaches the Docker API through sandbox-docker-proxy (BUILD/VOLUMES/SYSTEM/SECRETS endpoints disabled).
Two execution modes, both behind requester approval in Slack:
code_exec: one-shot Python snippets for calculations and checks.- Workbench sessions: a persistent hardened container per task that the agent drives with
sandbox_bashand file tools to build apps, run analysis, and produce artifacts. One approval covers the whole session.
Container policy: ghcr.io/astral-sh/uv:python3.11-bookworm-slim, NetworkMode=none, all capabilities dropped, no-new-privileges, read-only root filesystem, CPU/memory/PID limits, per-task workspace volumes removed with the container, idle/TTL reaping, lifecycle + bounded output previews written to the task timeline.
Kill switch: KORTNY_SANDBOX_EXECUTION_ENABLED=false.
Planned, not yet shipped, sequenced toward a stable V1.1:
- Network-enabled sandbox profile:
pip/npminstalls through an egress proxy with a registry allowlist, unlocking full app scaffolding. - Document studio quality loop: an automatic render → critique → revise pass over generated PDFs and decks for pixel-tight, well-balanced multi-page layout.
- Deploy-to-DigitalOcean marketplace image: a true one-click droplet (pre-baked VM with Docker + the sandbox) on top of the one-line installer.
Follow the issues for the live backlog.
Contributions are welcome: native tools, docs, bug reports, and integrations. Start with CONTRIBUTING.md. New tools implement one small Tool interface plus a catalog metadata entry; the tool authoring guide walks through it.
Found a security issue? Please use private vulnerability reporting, not a public issue.
Every team has that coworker who quietly keeps everything running, files the report, remembers the decision from three months ago, ships the dashboard nobody else had time for. Kortny is that coworker, except it never sleeps and it runs on your hardware.
Apache-2.0: permissive, with an explicit patent grant. Build whatever you want with it.