S K A R T A

The Purposeful Doer

From Sanskrit: one who acts with intention

Documentation • Quickstart • Concepts

Build agents in Python. Ship them on a Rust runtime.

Skarta is a production-grade runtime for multi-agent systems. You write the agents. Skarta handles orchestration, validation, scheduling, streaming, sessions, budgets, telemetry, and security.

pip install skarta

That single install ships a Python SDK and a pre-built runtime binary for macOS (Apple Silicon) and Linux (x86_64, aarch64). No Rust toolchain. No separate server.

Platform support

Windows is not natively supported in this release. The runtime uses Unix domain sockets for local IPC, which require a Unix-family host. Windows users can run Skarta today via WSL2. Native Windows wheels are on the roadmap.

macOS Intel (x86_64) is not built by CI. Apple Silicon Macs only for the macOS wheel today. Intel-Mac users can run the manual release script on an x86_64 macOS host or wait for native support.

Sandboxed code execution is Linux / Docker only for now. The capability sandbox that isolates untrusted or model-generated code in a subprocess (the Sandbox allowlists) cannot yet launch an interpreter such as Python under the native macOS backend. Run such code on a Linux host (native sandbox) or via the sarthiai/skarta Docker image. Everything else, including runtime-enforced tool permissions and the network / filesystem / env allowlists, works on macOS as documented; only subprocess sandboxing of untrusted code is affected.

Client / server architecture (under testing)

A client / server shape is in the works (the skarta-client PyPI wheel and the sarthiai/skarta Docker image), where your Python code connects to a remote Skarta runtime over the network. That path is currently under testing and not officially released for public use. The documentation here covers the supported pip install skarta install only. Stick with it for anything you care about until the client / server path lands properly.

See it work in one minute

A working two-agent typed pipeline that calls a real LLM. One terminal, one Python file, one env var.

1. Install:

pip install skarta
export OPENAI_API_KEY=sk-...

2. Save this as pipeline.py:

import asyncio
from pydantic import BaseModel
from skarta import LLM, Agent, Pipeline


class Topic(BaseModel):
    subject: str

class Brief(BaseModel):
    summary: str

class Headline(BaseModel):
    text: str


llm = LLM(model="gpt-4o-mini")

summarize = Agent(
    llm=llm,
    instructions="Write one sentence on the given subject.",
    input=Topic, output=Brief,
)
headline = Agent(
    llm=llm,
    instructions="Rewrite the brief as a punchy six-word headline.",
    input=Brief, output=Headline,
)


async def main() -> None:
    pipeline = Pipeline(summarize, headline)
    result = await pipeline.execute(Topic(subject="production-grade agent runtimes"))
    print(result.text)


if __name__ == "__main__":
    asyncio.run(main())

3. Run:

python pipeline.py

You will see a real LLM-generated headline.

What just happened: Skarta started, picked up your OpenAI key, saw that summarize produces a Brief and headline consumes a Brief, worked out the order itself, called OpenAI for the first agent, checked the reply matched the shape you asked for, fed it into the second agent, called OpenAI again, checked that reply too, and gave you back a finished Headline object. You wrote no flow chart, no retry loop, no queue, no glue.

Prefer Anthropic? Swap the LLM declaration:

llm = LLM(model="claude-haiku-4-5")  # picks up ANTHROPIC_API_KEY

OpenRouter, Groq, Ollama, or any OpenAI-compatible endpoint: pass base_url and provider="openai_compat" to LLM(...). For richer config (multiple providers, pricing overrides, persistent storage), drop a framework.toml next to your script. See Configuring an LLM endpoint for the full setup.

Want streaming, sessions, tools, or multi-agent collaboration? Three more lines:

from skarta import Tool, Memory, Team, Schedule

@Tool
async def search(args): ...                                  # tool the agent can call

agent = Agent(llm=llm, instructions="...", tools=[search])
agent = Agent(llm=llm, instructions="...", memory=Memory())  # cross-call memory
team  = Team(members=[a, b, c], shape="brainstorm")          # 3-agent dialogue

@Schedule("0 9 * * *")                                       # cron-trigger an agent
async def daily_brief(): await agent.execute("...")

The full surface is in Getting started and per-topic pages in docs/.

From your first agent to a production service

Eleven runnable files, each one a complete example. Each step adds one capability. Stop where Skarta does enough for what you're building. Every example assumes:

pip install skarta
export OPENAI_API_KEY=sk-...

Level 1. One agent that answers

Single-agent, single LLM call.

A customer types a question, your agent answers. The shortest possible Skarta program.

import asyncio
from skarta import LLM, Agent

agent = Agent(
    llm=LLM(model="gpt-4o-mini"),
    instructions="Answer in one short paragraph.",
)

async def main():
    print(await agent.execute("Why are Rust runtimes a good fit for agents?"))

asyncio.run(main())

Skarta started, called OpenAI, returned the answer, and shut down on its own. Nine lines, no server to run.

Level 2. Get the answer in the exact shape your code expects

Structured outputs with typed I/O.

The model returns a Python object with named fields, not a string of text you have to parse and pray over.

import asyncio
from pydantic import BaseModel
from skarta import LLM, Agent

class Question(BaseModel):
    text: str

class Answer(BaseModel):
    reply: str
    confidence: float

agent = Agent(
    llm=LLM(model="gpt-4o-mini"),
    instructions="Reply briefly. Include a confidence score from 0 to 1.",
    input=Question, output=Answer,
)

async def main():
    out = await agent.execute(Question(text="Is the sky blue?"))
    print(out.reply, out.confidence)

asyncio.run(main())

If the model returns junk or skips a field, Skarta raises a clear error before the bad value reaches your code. You can act on out.confidence and out.reply knowing they are exactly the types you asked for.

Level 3. Let the agent call your own functions

Tool calling (also called function calling), with the tool loop on autopilot.

Hand the agent a Python function that hits your API, queries your database, or returns a price from your pricing service. The agent decides when to call it.

import asyncio
from pydantic import BaseModel
from skarta import LLM, Agent, Tool

class WeatherArgs(BaseModel):
    city: str

@Tool
async def get_weather(args: WeatherArgs) -> dict:
    return {"city": args.city, "temp_c": 23, "condition": "sunny"}

agent = Agent(
    llm=LLM(model="gpt-4o-mini"),
    instructions="If asked about weather, call get_weather.",
    tools=[get_weather],
)

async def main():
    print(await agent.execute("What's the weather in Tokyo right now?"))

asyncio.run(main())

Skarta did the back-and-forth: the model asked to call get_weather, Skarta ran your function, fed the result back to the model, and the model wrote its final answer. None of that wiring was yours to write.

Level 4. An agent that remembers what the customer said last time

Conversational memory with pluggable session storage.

The agent picks up where the conversation left off. Same code works for a five-minute chat or one that spans a week.

import asyncio
from skarta import LLM, Agent, Memory

agent = Agent(
    llm=LLM(model="gpt-4o-mini"),
    instructions="You are a helpful note-taker.",
    memory=Memory(),
)

async def main():
    await agent.execute("My favourite colour is teal.", session="user-42")
    print(await agent.execute("What is my favourite colour?", session="user-42"))

asyncio.run(main())

Skarta loaded the earlier turns for session user-42, gave them to the model as context, and saved the new exchange back. Conversations live in a file by default. Switch to Memory(backend="db") and they ride along through restarts and across machines.

Level 5. Save a chat, try a different reply, jump back if you don't like it

Conversation branching and named session checkpoints.

Conversations aren't write-once. Save the chat at any turn, test a tangent, and roll back to the saved turn if the new direction was worse.

import asyncio
from skarta import LLM, Agent, Memory

mem = Memory()
agent = Agent(
    llm=LLM(model="gpt-4o-mini"),
    instructions="You are a trip-planning assistant.",
    memory=mem,
)

async def main():
    await agent.execute("Let's plan a 3-day trip to Kyoto.", session="trip")
    await mem.checkpoint("trip", name="after-outline")

    # Try a tangent on the same thread.
    await agent.execute("Actually, swap day 2 for a day-trip to Osaka.", session="trip")

    # Decide we preferred the original; rewind to the pinned point.
    await mem.restore("trip", name="after-outline")
    print(await agent.execute("OK keep Kyoto only. What's day 1?", session="trip"))

asyncio.run(main())

Skarta saved the conversation at the moment you named, let the agent try the Osaka tangent, then dropped that tangent and reset to the Kyoto plan. Use mem.fork(...) instead of restore to keep both versions alive and switch between them later.

Level 6. Hand the work along a line of agents

Agent pipeline with auto-DAG: Skarta derives the order from your data shapes.

A research question comes in. One agent plans the bullet points, the next writes a draft, the third compresses it to a one-page brief. You write the three agents; Skarta works out the order from your data and runs them back to back.

import asyncio
from pydantic import BaseModel
from skarta import LLM, Agent, Pipeline

class Question(BaseModel):
    text: str
class Outline(BaseModel):
    points: list[str]
class Draft(BaseModel):
    body: str
class OnePager(BaseModel):
    summary: str

llm = LLM(model="gpt-4o-mini")
planner   = Agent(llm=llm, instructions="List three bullet points to research.",
                  input=Question, output=Outline)
writer    = Agent(llm=llm, instructions="Write a short answer from these points.",
                  input=Outline,  output=Draft)
condenser = Agent(llm=llm, instructions="Compress the draft to about 80 words.",
                  input=Draft,    output=OnePager)

async def main():
    out = await Pipeline(planner, writer, condenser).execute(
        Question(text="How does a Rust agent runtime differ from a Python library?")
    )
    print(out.summary)

asyncio.run(main())

No flow charts, no edge lists. Skarta saw that planner produces an Outline, writer expects an Outline and produces a Draft, condenser expects a Draft, worked out the order, and checked every handoff so a malformed reply from one agent can't slip into the next.

Level 7. Pick a different path depending on the result

Auto-orchestrated DAG with conditional branching, retries, parallel fan-out, and racing.

A support ticket arrives. The triage agent decides severity; low-priority tickets get a quick acknowledgement, the urgent ones get a detailed apology and remediation plan. Only the chosen handler runs.

import asyncio
from pydantic import BaseModel
from skarta import LLM, Agent, Process

class Ticket(BaseModel):
    text: str

llm = LLM(model="gpt-4o-mini")
triage = Agent(
    llm=llm,
    instructions="Classify severity. Reply with exactly one word: low or high.",
    input=Ticket, output=str,
)
short_reply = Agent(llm=llm, instructions="Write a one-line acknowledgement.")
long_reply  = Agent(llm=llm, instructions="Write a detailed apology and remediation plan.")

p = Process.branch(
    triage,
    routes={"low": short_reply, "high": long_reply},
)

async def main():
    print(await p.execute(Ticket(text="My laptop is on fire.")))

asyncio.run(main())

Skarta ran the triage agent, sent the ticket to the matching handler, and recorded why the other handler was skipped. Want the agent to try again on flaky calls? Add retry=Retry(max_attempts=3). Want to process a list of tickets at once? Use Process.for_each. Want two agents to race? Process.race. All one line each.

Level 8. Several agents working it out among themselves

Multi-agent collaboration: agent teams and rooms with structured turn-taking.

Sometimes one agent isn't enough. A product manager, an engineer, and a customer-support agent argue over whether to ship a feature. They keep talking until they agree or the turn cap hits.

import asyncio
from skarta import LLM, Agent, Team

llm = LLM(model="gpt-4o-mini")
product = Agent(llm=llm, instructions="You are a product manager.")
eng     = Agent(llm=llm, instructions="You are an engineer focused on feasibility.")
support = Agent(llm=llm, instructions="You are customer support; raise user pain.")

team = Team(
    members=[product, eng, support],
    shape="brainstorm",
    terminate_on={"max_turns": 6},
)

async def main():
    result = await team.execute(
        "Should we ship feature X next quarter? Give a one-paragraph rationale."
    )
    print(result.final)

asyncio.run(main())

Pick shape="decision" and the agents take a structured vote. Pick shape="status-sync" and they go round-robin like a daily stand-up. Promote the team to a Room when you need named guests, silent observers, or one Skarta serving many customers in isolation.

Level 9. Run on a clock, or when something happens elsewhere

Scheduled agents and webhook-triggered agents, with in-runtime HTTP ingress and replay dedupe.

Your morning brief at 9 a.m. every day. Your refund agent the moment Stripe POSTs a charge.dispute. Your incident agent the moment PagerDuty fires. Each is a decorator.

import asyncio
from skarta import LLM, Agent, App, Schedule, Webhook

llm = LLM(model="gpt-4o-mini")
brief = Agent(llm=llm, instructions="Write a 3-bullet morning brief.")

@Schedule("0 9 * * *", tz="UTC")          # every day at 09:00 UTC
async def daily_brief():
    print(await brief.execute("Today's top AI stories?"))

@Webhook("github_issue_opened")           # POST /webhooks/github_issue_opened
async def on_issue(payload):
    print(await brief.execute(f"Summarise this issue: {payload}"))

async def main():
    async with App() as _:
        await asyncio.Event().wait()       # keep the runtime alive

asyncio.run(main())

Skarta read the cron line, opened the HTTP port, dropped any duplicate POSTs Stripe or GitHub retried, and ran your handler. No Flask, no FastAPI, no separate web server, no system cron. Same install.

Level 10. Tell other parts of your stack when something happened

Event-driven agents on a typed outbound bus (pub/sub).

Your refund agent finished. Your audit log, your Slack notifier, and your CRM all need to know. Don't wire each one to each other; publish a named event and let anyone interested subscribe.

import asyncio
from pydantic import BaseModel
from skarta import LLM, Agent, App, Event, Tool, current_context

class RefundArgs(BaseModel):
    order_id: str
    amount_usd: float

@Tool
async def issue_refund(args: RefundArgs) -> dict:
    ctx = current_context()
    await ctx.publish(
        "billing.refund_issued",
        {"order_id": args.order_id, "amount_usd": args.amount_usd},
        idempotency_key=f"refund-{args.order_id}",
    )
    return {"status": "refunded", "order_id": args.order_id}

refund_agent = Agent(
    llm=LLM(model="gpt-4o-mini"),
    instructions="Process refund requests with the issue_refund tool.",
    tools=[issue_refund],
)

# Another part of the system (could be a different process, even a different language) reacts:
@Event("billing.refund_issued")
async def audit(envelope):
    print(f"audit: refund issued for order {envelope['payload']['order_id']}")

async def main():
    async with App() as _:
        await refund_agent.execute("Please refund order #1234 for $19.99.")

asyncio.run(main())

Skarta checked the event name, dropped any duplicate the same refund-1234 key had already fired, and delivered the message to every listener. Metrics on how often each event fires land in your Prometheus dashboard automatically. The audit listener doesn't have to live in this file; it can run in a different process, a different language, or a different deploy and still receive the same event.

Level 11. The full production-grade setup

Guardrails, hard budgets, sandboxing, HITL (human-in-the-loop).

Hard spend caps, human approval on risky tools, an allowlist of hosts the agent may call. The lines you'd otherwise write yourself, or skip and regret. Skarta hands them to you as constructor arguments.

import asyncio
from pydantic import BaseModel
from skarta import LLM, Agent, App, Approval, Budget, Permissions, Tool

class RefundArgs(BaseModel):
    order_id: str
    amount_usd: float

@Tool(requires_approval=True)
async def issue_refund(args: RefundArgs) -> dict:
    return {"status": "refunded", "order_id": args.order_id}

agent = Agent(
    llm=LLM(model="gpt-4o-mini"),
    instructions="Help the user. Use issue_refund when justified.",
    tools=[issue_refund],
    budget=Budget(max_cost_usd=0.50),
)

async def main():
    async with App(
        approval=Approval.slack("https://hooks.slack.com/services/..."),
        permissions=Permissions(
            network=["api.openai.com"],
            env_vars=["OPENAI_API_KEY"],
        ),
    ) as _:
        print(await agent.execute("Please refund order #1234 for $19.99."))

asyncio.run(main())

Skarta cut off model spend at $0.50, paused the refund tool call and posted it to Slack for a human to approve, and refused any outbound HTTP except to api.openai.com, even if the agent was told to call elsewhere. Point SKARTA_DATABASE_URL at Postgres and conversations, spend caps, and the audit log live in your database.

Where each rung is fully documented

Levels 1 to 4 docs/agents.md
Level 5 Sessions in docs/agents.md, docs/conversations.md
Levels 6 and 7 docs/processes.md
Level 8 docs/teams.md, docs/rooms.md
Level 9 docs/triggers.md, docs/schedules.md, docs/webhooks.md
Level 10 docs/outbound-events.md
Level 11 docs/hitl.md, docs/sandbox.md, docs/configuration.md

What's in the box

Everything below ships in the same pip install skarta. Nothing is paid, nothing is an add-on, nothing needs a separate server install.

Six ways to put agents to work: single agent through multi-agent orchestration

Single agent with tool calling and structured outputs. One agent answering a customer, drafting an email, calling your APIs, looking up an order. Agent
Agent pipeline with auto-DAG. A line of agents that hands the work along in order: read the email, classify the intent, draft the reply, polish the tone. You write the four agents; Skarta derives the DAG (the order of work) from the shape of the data and runs them back to back. Pipeline
Auto-orchestrated DAG. A workflow that isn't a straight line. A support ticket gets classified, then refunds go to one agent and complaints to another (conditional branching). A research task splits into ten sub-questions, runs them at the same time (parallel fan-out), then a writer merges the answers. A draft gets rewritten in a loop until the quality score crosses a bar. Two models compete on the same question and you keep whichever answered first (race + cancel losers). Process
Multi-agent collaboration. Several agents talking among themselves. A product manager, an engineer, and a support rep argue over whether to ship a feature. Pick brainstorm for free exploration, decision for a structured vote, status-sync for a stand-up round. The conversation stops on its own when they agree or the turn cap hits. Team
Multi-agent rooms. A named multi-agent conversation with a guest list. Mix a single agent, a whole team, and a multi-step workflow in the same room. Add a chair to keep order, observers who watch but don't speak, and agents that join in the middle. Room
LLM-driven agent orchestration (planner-as-tool). Let an LLM design the workflow itself. You hand it the goal and a menu of agents and tools; it picks who runs in what order; Skarta executes the plan. Useful when the right workflow depends on the request. You can keep separate planners for support, sales, and research running side by side. Orchestrator

Triggers and event-driven integrations

Scheduled agents (cron). An agent that runs every morning at 9 a.m., every fifteen minutes, or on any cron expression. Skarta keeps the clock. Schedule
Webhook-triggered agents with in-runtime HTTP ingress. An agent that wakes up when Stripe charges a card, GitHub opens an issue, or Slack receives a message. Skarta hosts the URL, drops duplicate POSTs (replay dedupe), and answers either right away (202 Accepted) or after the agent finishes. New webhook URLs go live via one API call, no redeploy (dynamic webhooks). Webhook
Outbound events (pub/sub). When one agent finishes and another part of your system needs to know. Your refund agent emits billing.refund_issued; an audit logger written in Rust and a Slack notifier written in Python both pick it up; replays are dropped so the audit row is written once. Event

Memory, skills, and HITL (human-in-the-loop)

Conversational memory with branching and checkpoints. Conversations that survive restarts. Save a chat at the moment the customer said "yes", try a different reply on the side, snap back to the saved moment if you don't like the new one. Keep sessions in a file on disk, in SQLite, in Postgres, or in any pluggable session store you write. Memory
Agent Skills with progressive disclosure. Drop a folder of instructions next to your agent and it picks them up only when the work calls for them. A refund-policy folder, a brand-voice folder, a regulatory-checklist folder. Same format Claude, Cursor and others read; the open Agent Skills standard. Skill
HITL approval gates. About to refund $5,000, sign a contract, or run a destructive database query? Skarta pauses the agent (requires_approval=True), pings a human in Slack or email, and only continues on a thumbs-up. Hand the approval call to your own webhook if you have one. Approval

Production guardrails the runtime enforces for you

Hard budget caps (call / agent / workflow). Cap a single agent's spend at $0.50, cap a full workflow at $50, cap a customer at $5,000 a day. Skarta refuses the next model call the moment a cap is hit. A runaway prompt can't burn your OpenAI bill. Budget
Typed retry policies with backoff. Anthropic returned a 429? OpenAI timed out? Skarta tries again with the wait pattern you choose, gives up cleanly so callers always see the same outcome. Pick which error codes count as retryable. Retry
Tool permissions and per-tool sandboxing, runtime-enforced. An agent that should only call api.stripe.com can't reach anywhere else, even if a prompt injection tells it to. Declare the file paths, hostnames, and environment variables an agent may touch; the runtime blocks the rest. Permissions and Sandbox
Idempotency for replay-safe steps. Stripe replays the same webhook three times after a glitch; your charge agent runs once. Mark the step @idempotent, point Skarta at a database-backed idempotency store, and duplicate work is dropped even if Skarta restarted in between. @idempotent
Lifecycle hooks for observability and policy. Run code before every model call (redact PII from prompts, log to Datadog) or after every tool call (audit, block on a policy). Six hook points, decorator-style. @before_model, @after_tool, and friends

What you won't find in any Python-library agent framework

Rust runtime, single binary, one install. One pip install skarta ships a Rust runtime under your Python code. No second framework, no "make it production-ready" project.
Auto-DAG from data flow. No flow charts to wire, no depends_on fields, no edge declarations. Tell Skarta which agent takes what kind of data and produces what kind, and it derives the DAG itself. Change a data shape and the order updates with no code change.
Cycle detection at submit. A circular workflow gets caught before Skarta makes a single model call. You see exactly which step is looping back to which.
LLM-driven orchestration (planner-as-tool), multiple orchestrators co-existing. Let GPT or Claude design the workflow. Hand the LLM your goal and the list of agents available; it picks who runs in what order; Skarta runs the plan. Keep several different planners (support, sales, research) under one runtime, each with its own goals and tools.
Race + cancel losers in one binding. Send the same hard question to a fast cheap model and a slow smart one. Take whichever finished first. Skarta cancels the loser so you don't pay for the answer you threw away.
Schema validation at five gates, schema-evolution warnings on re-register. Every handoff between agents is type-checked. A model that hallucinates a missing field gets caught before its reply ever reaches the next agent. Re-deploy a worker with a new input shape and Skarta warns the moment an old caller is about to break, before the next workflow runs.
In-runtime HTTP ingress for webhooks, with replay dedupe and dynamic URLs. The webhook URLs that Stripe and GitHub call are hosted inside Skarta itself. No Flask, no FastAPI, no nginx in front. Duplicate POSTs are dropped for you; new URLs go live without a redeploy.
Hot worker reload. Ship a bug fix to one agent and the runtime swaps the worker in place; in-flight workflows keep going.
Observability in the same wheel. Prometheus metrics. OpenTelemetry traces to Datadog / Honeycomb / Grafana. /health and /readyz for Kubernetes liveness probes. All in the same pip install.

What you can build

If a workflow can be described in plain language and split into steps, you can ship it on Skarta. The primitives above (Agent, Pipeline, Process, Team, Room, Orchestrator, Schedule, Webhook, Event, Memory, Skill, Approval) are the LEGO bricks. Below is a sample of what teams are building with them, organised by where the agent earns its keep. It's illustrative, not exhaustive. The ceiling is your imagination.

Customer-facing agents

Agentic customer-support automation with HITL. A triage agent reads each ticket, refunds go to one handler and complaints to another, a draft reply waits for a human's thumbs-up in Slack, and the agent never spends more than a few cents per ticket.
Sales and lead qualification. An agent reads inbound leads from your CRM, scores them against BANT or MEDDIC, drafts a personalised first-touch email, and routes high-value leads to a human SDR with spend capped per lead.
Customer-success copilots with persistent sessions. Conversational memory per customer, full history surviving restarts. The copilot picks up exactly where the conversation ended a week or a month ago.
Streaming chat copilots in your product. Tokens stream to your UI as the model thinks. Conversations survive page refreshes and server restarts.

Internal-team copilots

HR and policy Q&A. Drop your HR handbook in as an Agent Skill (progressive disclosure) and the agent answers employee questions citing the exact policy, with no fine-tuning required.
Onboarding walkthroughs. A copilot that walks new hires through their first-week tasks, calling your provisioning APIs as Tools, pausing for a human's thumbs-up before granting elevated access (Approval).
Internal knowledge bases. Per-team budget caps so one team's experiments can't burn the shared OpenAI bill.

Engineering and DevOps

Code-review agents. GitHub fires the moment a PR opens; the agent reviews the diff, comments inline, flags risk for a human reviewer. Replay-safe so a redelivered GitHub webhook doesn't post duplicate comments (@idempotent).
Incident-response agents. PagerDuty fires; an event-driven agent pulls related metrics from Datadog, summarises the incident, drafts a status-page update, and pings on-call in Slack.
Release-note generators. A nightly cron agent scans merged PRs, drafts release notes grouped by feature, posts them for review.
Migration and refactor agents. A multi-agent team (planner + executor + verifier) proposes database migrations, runs them in a sandbox, rolls back on any test failure.

Data, analytics, and reporting

Document extraction at scale. PDF in, structured JSON out, validated against your Pydantic schema. Parallel fan-out lets you process 500 invoices at once.
Cross-source reporting. A Process pulls from your data warehouse, your CRM, and your billing system in parallel; a writer agent fuses the answers into a one-page weekly brief on a Schedule.
Anomaly explanation. A metrics-watching agent wakes up on a Datadog alert, pulls related logs and traces, writes a one-paragraph hypothesis for the on-call engineer.
Deep-research agents with parallel fan-out and quality loops. One agent breaks a question into sub-questions, ten agents research them in parallel, a writer fuses the findings, a critic iterates the draft until the quality score crosses a bar.

Finance and operations

Invoice processing. An OCR tool extracts line items, an agent matches them to purchase orders, anything over $5,000 goes to a human in Slack (Approval), and the final approval emits an invoice.approved event your ERP listens for.
Reconciliation. A daily agent walks ledger entries against bank statements, flags mismatches, drafts journal corrections for human approval.
Fraud-flagging. Every transaction triggers an event-driven agent that scores it against historic patterns and either lets it through or pauses it for review.
Compliance and audit. Agents check documents against regulations loaded as Skills; every model call lands in the audit log automatically.

Content, marketing, and growth

Editorial pipelines. Brief → outline → draft → SEO check → image-prompt generation → publish. Each step is an Agent; Skarta derives the auto-DAG from your data shapes.
Pricing intelligence. A scheduled agent scrapes competitor pricing, compares to yours, writes a daily Slack summary with recommendations.
Personalised outreach. Per-customer drafts that pull from a Memory of every prior conversation and a Skill folder of your brand-voice rules.
SEO and content audits. Multi-agent teams that crawl, score, and propose changes across thousands of pages with budget caps so the AWS bill doesn't run away.

Platform and infrastructure

Multi-language agent extensions. Your Python agents talk to a Rust agent owned by your platform team. Both deploy independently. Neither restart takes the other down.
Self-improving agent loops. A draft agent and a critic agent in a Process.loop, iterating until the quality score crosses a bar. Autonomous quality control without manual review.
Event-driven workflows across your stack. Agents that wake up when Stripe charges a card, GitHub opens a PR, or Slack receives a message. New webhook URLs created at runtime, no redeploy.

Skarta gives you the engine: scheduling, validation, budgets, sessions, observability. What you put on top is up to you.

Extend it without forking it

Skarta isn't a library you import and patch. Your code runs in its own process and talks to Skarta over a typed connection. That means a teammate can write a new agent in Rust, plug it in tomorrow, and your Python agents call it as if it were local. No fork. No upstream PR. No co-ordinated deploy.

The Python SDK has two layers, both supported, both documented.

The everyday surface (what 95% of users write). One import per idea:

Building blocks	What you reach for them to do
`LLM`, `Agent`, `Tool`, `Step`	A single agent with tool calling and structured outputs (`Agent`), or a deterministic non-LLM step (`Step`) like a database lookup.
`Pipeline`, `Process`	Chain agents in a fixed order with auto-DAG (`Pipeline`), or build an auto-orchestrated workflow with conditional branching, parallel fan-out, race, retries, and loops (`Process`).
`Team`, `Room`	Multi-agent collaboration: free-form discussion (`Team`), or a named multi-agent conversation with guests, observers, and a chair (`Room`).
`Orchestrator`	LLM-driven orchestration (planner-as-tool). Hand the LLM your goal and the agents available; it picks who runs in what order.
`Memory`, `Skill`	Conversational memory with branching and named checkpoints (`Memory`). Agent Skills with progressive disclosure (`Skill`).
`Schedule`, `Webhook`, `Event`	Scheduled agents (cron), webhook-triggered agents (in-runtime HTTP ingress), and event-driven agents (outbound pub/sub bus).
`Approval`	HITL approval gates on any tool: Slack, email, or a custom delivery webhook.
`Budget`, `Retry`	Hard budget caps at call, agent, and workflow level. Typed retry policies with backoff.
`Permissions`, `Sandbox`	Runtime-enforced tool permissions and per-tool sandboxing. Filesystem, network, and env-var allowlists.
`App`	Long-lived service lifecycle. Wrap everything in `async with App() as app: ...`.
`@idempotent` + `App.idempotency_store(...)`	Idempotent steps for replay-safe webhook handlers. Durable dedupe across restarts when backed by a database store.
`before_model`, `after_tool`, ...	Lifecycle hooks for observability and policy (log to Datadog, redact PII, block suspicious calls).

Going deeper. When the everyday surface doesn't cover a case, drop one layer. Eight places you can plug into the runtime from any supported language:

Tools Functions you write in Python or Rust that an LLM can call: look up an order, hit your API, write to your database, send a Slack message.
Workers Custom agents that need more control than the Agent wrapper gives.
Orchestrators Your own LLM planner when the built-in one doesn't fit your domain. Run several different planners side by side under one Skarta.
Interceptors Run code at six points in an agent's life: before and after each model call, before and after each tool call, on context changes, on compaction. Log, redact, or refuse the operation.
Context providers Inject extra messages or system text into the model's context. Think per-customer personas pulled from a database, or session-specific guardrails.
Skills Drop a folder of instructions next to your agent. The agent picks it up only when the work calls for it. Same format Claude, Cursor and others read, the open Agent Skills standard.
Webhooks Skarta hosts the URLs Stripe, GitHub, or your backend POSTs to. It drops duplicate POSTs, sends the request to your handler, and replies right away or after the work finishes, your call. New customer URLs go live without a redeploy.
Session storage Swap the default file or database backend for Redis, S3, or any other store you can write a thin adapter for.

If an agent crashes, only that agent restarts. Ship a fix to one agent while the rest of the service keeps running. Skarta itself stays the same Rust binary; everything above is yours to swap, add, or replace. See docs/concepts.md for the full surface.

Why a Rust runtime under your Python code

Most agent frameworks are Python libraries. They are great for the first demo and painful the day a real customer uses them. Every workflow lives inside one Python process, a single crash takes the whole thing down, a single retry forgets where it was, observability is something you bolt on, spend tracking is whatever the SDK happens to print.

Skarta is a different shape. The runtime is one Rust binary that owns the hard parts: scheduling, validation, budgets, sessions, persistence, telemetry, and access control. Your agent code lives in its own process, in any supported language, and talks to the runtime over a typed wire protocol.

Crash isolation per extension. An agent crashing doesn't take the runtime down or lose your other in-flight workflows.
Hot reload. Ship a fix to one agent without restarting the rest of the service.
High-throughput parts written once, in Rust. Scheduling, validation, and the wire protocol aren't re-implemented per language.

How Skarta compares

Most agent frameworks are Python libraries. You bring the type system, the orchestration, the permissions, and the cost caps. Skarta ships them all.

	Skarta	LangGraph	CrewAI	AutoGen	Pydantic AI	OpenAI Agents
Foundation
Rust runtime, real parallelism	●	○	○	○	○	○
OpenTelemetry (OTLP) traces + Prometheus metrics built in	●	○	○	○	○	○
Structured I/O
Schema-typed worker I/O (schemars / Pydantic)	●	◐	◐	◐	●	◐
Validated at 5 gates (submit, register, dispatch, output, complete)	●	○	○	○	◐	○
Schema-evolution warnings on re-register	●	○	○	○	○	○
Typed protocol errors	●	◐	○	◐	◐	◐
Orchestration
Auto DAG from code, no explicit edges	●	○	○	○	○	○
Auto dependency resolution from data flow	●	○	◐	○	○	○
Cycle detection at submit	●	○	○	○	○	○
LLM emits the DAG (planner-as-tool)	●	○	◐	◐	○	○
Multiple orchestrators co-exist	●	◐	○	◐	○	○
Multi-team: workers as global pool, teams as views	●	○	○	○	○	○
Reliability primitives
Race + cancel losers (one binding)	●	○	○	○	○	○
Loops with previous-iteration binding	●	◐	○	○	○	○
Conditions, retries, timeouts, on_failure, fan-out	●	◐	◐	○	◐	○
Validation policy ladder (permissive override)	●	○	○	○	○	○
Skipped + cancelled nodes surfaced with reason	●	○	○	○	○	○
Webhooks (HTTP ingress)
In-runtime HTTP ingress for inbound webhooks (no FastAPI in front)	●	○	○	○	○	○
Replay dedupe + sync/async response modes built in	●	○	○	○	○	○
Dynamic webhooks: URLs created at runtime via admin RPC, persisted	●	○	○	○	○	○
Cost and permissions
Hard cost caps (call / agent / workflow)	●	○	○	○	○	○
Tool permissions (path / network / env, runtime-enforced)	●	○	○	○	○	◐
State, hooks, skills, tooling
Session branching + named checkpoints	●	◐	○	○	○	○
Lifecycle hooks (model call, context mutation, compaction)	●	◐	○	◐	◐	○
On-demand skills (progressive disclosure)	●	○	○	○	○	○
Dependency viz (Mermaid / DOT / JSON / ASCII)	●	◐	○	○	○	○
Model registry RPC	●	○	○	○	○	○

Legend:

● in the box
◐ via developer code or paid tier
○ not available

_{Frameworks evolve. Open an issue if any cell drifts.}

Production-ready out of the box

Capability	What you get
Persistence	Start in-memory on day one, perfect for development. Flip one env var and every conversation, budget, and audit row gets saved to SQLite or Postgres. Your code doesn't change.
Observability	`/health` and `/readyz` for Kubernetes liveness probes. `/metrics` in Prometheus format. Structured JSON logs. OpenTelemetry (OTLP) traces to Datadog, Honeycomb, or Grafana Tempo.
Permissions	Every agent declares the file paths, hostnames, and environment variables it may touch. The runtime blocks everything else. A prompt-injected agent still can't reach a system you didn't authorise.
Budgets	Tokens and dollar cost tracked per worker, per workflow, per call. Soft ceilings warn; hard ceilings stop the next model call before it goes out the door.
Hot reload	Ship a bug fix to one agent and Skarta swaps the worker in place; in-flight workflows keep running. Schema-evolution warnings fire the moment a new version's input shape changes, before old callers silently break.
Webhooks	In-runtime HTTP ingress hosts the URLs Stripe, GitHub, and your backend POST to. Replay dedupe by a header you name. Sync (`await` the DAG) or async (`202 Accepted`) response modes. New URLs go live without a redeploy.
Sync and async	Every RPC ships on both `skarta.Client` (sync) and `skarta.AsyncClient` (async/await). Pick whichever fits your codebase.

Install

pip install skarta            # runtime + Python SDK in one wheel

Custom OpenAI-compatible endpoints (Ollama, vLLM, Azure OpenAI, etc.) work out of the box via LLM(model=..., api_key=..., base_url=..., provider="openai_compat").

Documentation

Two reading lanes. Pick Default if you are building something with Skarta. Pick Advanced if the wrapper does not give you the control you need or if you are reading SDK internals. docs/README.md is the full index.

Default surface (start here):

Quickstart Three-line hello agent, ten-line typed pipeline, twenty-five-line multi-agent team.
Getting started First real workflow, walked end-to-end: LLM calls, streaming, sessions.
Agents Full Agent reference.
Pipelines and Processes Pipeline / Process, special-shape constructors, branch hooks.
Teams and Rooms Free-form and declarative multi-agent collaboration.
Triggers Schedule, Webhook, Event decorators with one runnable example each.
Idempotency Process.add(idempotent=...) shorthand and App.idempotency_store(...) configurator.
HITL approval Slack / email / custom approval gates for any tool.
Concepts The core ideas plus the Default vs Advanced framing.
Configuration The full env-var + framework.toml reference.
Architecture Runtime + extensions model, the wire protocol.

Advanced / lower-level control:

Python SDK reference Comprehensive reference. The high-level wrapper table is at the top; the lower-level @worker / @tool / Client / Extension / bindings_dsl / FrameworkError reference is fenced under the trailing Advanced section.
Rust SDK reference For native Rust extensions to Skarta.
Every Default page above also carries a trailing ## Advanced (lower-level reference) section that documents the lower-level surface for that topic.

S K A R T A

Designed, developed, and maintained by Chirotpal

Name		Name	Last commit message	Last commit date
Latest commit History 94 Commits
.github/workflows		.github/workflows
crates		crates
docker		docker
docs		docs
fuzz		fuzz
proto		proto
sdks		sdks
tests/harness		tests/harness
.env.example		.env.example
.gitignore		.gitignore
CONTRIBUTING.md		CONTRIBUTING.md
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
LICENSE		LICENSE
README.md		README.md
framework.toml.example		framework.toml.example

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

S K A R T A

The Purposeful Doer

See it work in one minute

From your first agent to a production service

Level 1. One agent that answers

Level 2. Get the answer in the exact shape your code expects

Level 3. Let the agent call your own functions

Level 4. An agent that remembers what the customer said last time

Level 5. Save a chat, try a different reply, jump back if you don't like it

Level 6. Hand the work along a line of agents

Level 7. Pick a different path depending on the result

Level 8. Several agents working it out among themselves

Level 9. Run on a clock, or when something happens elsewhere

Level 10. Tell other parts of your stack when something happened

Level 11. The full production-grade setup

Where each rung is fully documented

What's in the box

What you can build

Extend it without forking it

Why a Rust runtime under your Python code

How Skarta compares

Production-ready out of the box

Install

Documentation

S K A R T A

About

Uh oh!

Releases 8

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

S K A R T A

The Purposeful Doer

See it work in one minute

From your first agent to a production service

Level 1. One agent that answers

Level 2. Get the answer in the exact shape your code expects

Level 3. Let the agent call your own functions

Level 4. An agent that remembers what the customer said last time

Level 5. Save a chat, try a different reply, jump back if you don't like it

Level 6. Hand the work along a line of agents

Level 7. Pick a different path depending on the result

Level 8. Several agents working it out among themselves

Level 9. Run on a clock, or when something happens elsewhere

Level 10. Tell other parts of your stack when something happened

Level 11. The full production-grade setup

Where each rung is fully documented

What's in the box

What you can build

Extend it without forking it

Why a Rust runtime under your Python code

How Skarta compares

Production-ready out of the box

Install

Documentation

S K A R T A

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 8

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages