Campaigns

Campaigns is the open-source Campaign Kernel for AgentOS: a capability-driven control plane where a user expresses an outcome, execution policy, SLA, resource ceiling, and quality requirements, and the system plans, routes, executes, evaluates, repairs, and returns measurable artifacts without exposing provider complexity.

The product philosophy is deliberately not Goal -> Planner -> Agent -> Tool. It is:

Goal -> Campaign Kernel -> Capability -> Implementation -> Evaluation -> Artifact

Providers such as Claude Code, Codex, Ollama, OpenAI, or Anthropic are interchangeable driver details behind capability implementations. Hermes is the primary Phase 0 dogfooding interface, but it is only a thin client/tool consumer over the AgentOS MCP server or REST API. Campaign Kernel remains the system of record.

The design is inspired by multi-agent economies such as Qi et al., "Economy of Minds: Emerging Multi-Agent Intelligence with Economic Interactions" (2026), https://arxiv.org/pdf/2606.02859, especially the idea that capable agent societies need explicit interaction protocols, resource constraints, specialization, and outcome-oriented coordination rather than a flat task list.

Architectural layer boundary

Campaigns intentionally models the layer above AgentRL:

Layer 1: Runtime
  Question: How does one agent solve a task?
  Output: Trajectory
  Examples: Claude Code, OpenHarness, OpenHands, Codex, OpenCode

Layer 2: Harness Lifecycle
  Question: How do we improve, evaluate, evolve, version, and deploy agents?
  Output: Improved Agent System
  Owner: AgentRL

Layer 3: Swarm Operating System
  Question: How do we continuously execute business objectives through evolving agent organizations?
  Output: Campaign Outcome
  Owner: Campaigns

AgentRL powers Campaigns through deployable pods; Campaigns should not absorb AgentRL's lifecycle responsibilities.
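
The three layers above can be written down as plain data, which makes the ownership boundary easy to check in code. This is only an illustrative sketch: SketchLayer, LayerInfo, LAYERS, and owner_of are hypothetical names, not the package's ArchitectureLayer primitive, and the "agent runtime" owner for Layer 1 is an assumption since the table lists examples rather than an owner.

```python
from dataclasses import dataclass
from enum import Enum


class SketchLayer(Enum):
    """Hypothetical stand-in for the three layers described above."""
    RUNTIME = 1
    HARNESS_LIFECYCLE = 2
    SWARM_OPERATING_SYSTEM = 3


@dataclass(frozen=True)
class LayerInfo:
    question: str
    output: str
    owner: str


LAYERS = {
    SketchLayer.RUNTIME: LayerInfo(
        "How does one agent solve a task?",
        "Trajectory",
        "agent runtime",  # assumption: the README lists examples, not an owner
    ),
    SketchLayer.HARNESS_LIFECYCLE: LayerInfo(
        "How do we improve, evaluate, evolve, version, and deploy agents?",
        "Improved Agent System",
        "AgentRL",
    ),
    SketchLayer.SWARM_OPERATING_SYSTEM: LayerInfo(
        "How do we continuously execute business objectives "
        "through evolving agent organizations?",
        "Campaign Outcome",
        "Campaigns",
    ),
}


def owner_of(output: str) -> str:
    """Return the owner of the layer that produces the given output."""
    for info in LAYERS.values():
        if info.output == output:
            return info.owner
    raise KeyError(output)


print(owner_of("Campaign Outcome"))
```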

AgentRL answers:

How do we improve, evaluate, evolve, version, and deploy an agent harness?

Campaigns answers:

How do we continuously execute a user's goal through an evolving autonomous organization?

The user does not micromanage every task. The user defines goals, reviews harnesses and approval gates, monitors traces when desired, and receives a single final review packet covering the fleet and contract agents.

Dynamic workflow

1. User creates a targeted AgentRL harness
   Example: Market Researcher with RAG, trace, decision-log, evaluation, memory, and approval-gate components.

2. User defines a campaign
   Example: run a marketing campaign using the Market Researcher and RAG Analyst harnesses.

3. Campaigns employs those harnesses as fleet agents
   Each employed agent has a mandate, decision rights, review obligations, and an AgentRL pod declaration.

4. Fleet agents plan and execute bounded work
   The Market Researcher runs RAG-grounded research. The Campaign Manager creates approval gates and operating cadence.

5. Fleet agents contract short-term specialist workers in parallel
   Example contracts: SEO Optimizer, Outreach Worker, Creative Worker, Analytics Worker.

6. Contract agents return deliverables, traces, costs, and evidence
   Employed agents remain accountable for synthesis and decisions.

7. Campaigns synthesizes a final report
   The user receives one final review packet across all fleet and contract agents instead of being forced to micromanage.

8. User monitors traces and performance reviews when desired
   Trace monitors expose decision quality, constraint compliance, contract outcomes, evidence quality, and cost/timeline drift.

9. User performs ultimate review
   Approve, revise, stop, or launch the next iteration.

10. AgentRL consumes traces and review outcomes
    Harnesses can evolve, be versioned, promoted, deployed, or rolled back.
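
One way to picture the steps above is as a small dependency graph in which the contract work from step 5 runs in parallel. The sketch below uses the standard-library graphlib module; the step names and the execution_waves helper are hypothetical and are not taken from the package's WorkflowStep primitive. Optional trace monitoring (step 8) is left out because it does not gate any other step.

```python
from graphlib import TopologicalSorter

# Hypothetical step names; each step maps to the steps it depends on.
WORKFLOW = {
    "create_harness": set(),
    "define_campaign": {"create_harness"},
    "employ_fleet": {"define_campaign"},
    "fleet_plan_execute": {"employ_fleet"},
    "contract_seo": {"fleet_plan_execute"},
    "contract_outreach": {"fleet_plan_execute"},
    "synthesize_report": {"contract_seo", "contract_outreach"},
    "ultimate_review": {"synthesize_report"},
    "agentrl_evolution": {"ultimate_review"},
}


def execution_waves(graph):
    """Group steps into waves; steps in the same wave can run in parallel."""
    sorter = TopologicalSorter(graph)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(WORKFLOW)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

In this sketch the two contract steps land in the same wave, while every other step waits for its predecessor.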

Campaign primitive

campaign:
  objective: Increase recurring revenue by 30% for a local detailing business
  budget:
    dollars: 5000
  timeline:
    days: 90
  metrics:
    - recurring_revenue
    - conversion_rate
  constraints:
    - human approval for spend > $500
  employed_harnesses:
    - agent_name: Market Researcher
      role: market_researcher
      objective: Research demand, competitors, segments, and campaign risks
      components: [rag, trace, decision_log, evaluation]
    - agent_name: RAG Analyst
      role: rag_analyst
      objective: Retrieve and synthesize evidence for claims and assumptions
      components: [rag, trace, evaluation]
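
To make the budget and the "human approval for spend > $500" constraint concrete, here is a minimal sketch of how a proposed spend might be checked. The check_spend function, the approval_threshold_dollars key, and the three result strings are assumptions made for illustration; the source only states the constraint as free text and does not describe how it is enforced.

```python
# Hypothetical structured form of the campaign above.
campaign = {
    "budget": {"dollars": 5000},
    "approval_threshold_dollars": 500,  # assumption: parsed from the constraint text
}


def check_spend(campaign: dict, spent_so_far: float, proposed: float) -> str:
    """Classify a proposed spend against the budget and approval threshold."""
    if proposed <= 0:
        raise ValueError("proposed spend must be positive")
    if spent_so_far + proposed > campaign["budget"]["dollars"]:
        return "reject_over_budget"
    if proposed > campaign["approval_threshold_dollars"]:
        return "needs_human_approval"
    return "auto_approve"


print(check_spend(campaign, spent_so_far=0, proposed=750))
```

The threshold comparison is strict, matching "spend > $500": a spend of exactly $500 would not require approval in this sketch.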

A campaign turns a user goal into an accountable operating structure:

Campaign
  -> Workflow DAG
  -> Organization
  -> Team
  -> Employed Fleet Agent
  -> AgentRL Pod Instantiation
  -> Runtime / Harness
       -> Contracted Agents for short-term parallel work
  -> Trace Monitor
  -> Performance Reviews
  -> Ultimate User Review
  -> AgentRL Evolution / Promotion / Rollback

AgentOS architecture repository

This repository is now treated as the Campaign Kernel seed for AgentOS, not as a monolithic agentos repo. The canonical architecture source lives in docs/agentos, with the user-amended v3 architecture package in docs/agentos/v3 and ADRs in docs/adr.

The non-negotiable product philosophy is:

Goal -> Campaign Kernel -> Capability -> Implementation -> Evaluation -> Artifact

Providers are interchangeable device drivers. LangGraph is only the execution runtime. Hermes, dashboards, SDKs, and MCP clients are clients/facades, never the system of record. Phase 0 dogfooding should happen primarily through Hermes consuming the AgentOS MCP server or REST API.

Install

From PyPI after release:

pip install campaigns-os

From source:

python -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

Run from GitHub Container Registry after package publication:

docker pull ghcr.io/junaidahmed361/campaigns:latest
docker run --rm ghcr.io/junaidahmed361/campaigns:latest --version

Quick start

1. Express the outcome, not the provider

campaigns dogfood \
  --goal "Ship an evaluated backend API for campaign receipts" \
  --budget 25 \
  --constraint "immutable receipts" \
  --quality "tests pass" \
  --quality "receipt provenance is replayable"

This accepts only user-facing intent (goal, budget, constraints, quality requirements, and an optional SLA), then returns selected capability contracts, measurable artifacts, evaluation gates, an EUV/$ routing objective, and an immutable receipt. Provider complexity stays behind drivers.

2. Dogfood execution with Resource Manager reservations and local CLI auth

Phase 0 can dogfood without Stripe, Apple Pay, invoices, or payment-wallet plumbing. The dogfood-exec command uses the open-source Resource Manager to reserve dollars from an execution policy, then invokes a local CLI driver through the existing Claude/Codex auth configuration. AgentOS does not read or modify auth files and does not receive API keys.

campaigns dogfood-exec \
  --goal "Ship an evaluated backend API for immutable campaign receipts" \
  --budget 25 \
  --reserve 2.50 \
  --constraint "no payment wallet in the open-source core" \
  --constraint "provider choice stays hidden behind CLI drivers" \
  --quality "provider response artifact is captured" \
  --quality "immutable receipt is produced"

For developer diagnostics only, you may force a driver:

campaigns dogfood-exec --driver codex ...
campaigns dogfood-exec --driver claude ...

Normal product use should omit --driver; routing should select a capability implementation without asking the user to think about providers.
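
The reservation step described in this section (reserve 2.50 against a budget of 25) can be pictured as a small ledger that refuses any reservation exceeding the remaining ceiling. This is a sketch under stated assumptions: ReservationLedger and CeilingExceeded are hypothetical names, and the actual ResourceManager, ExecutionPolicy, and ResourceReservation classes may behave differently.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Dict


class CeilingExceeded(Exception):
    """Raised when a reservation would exceed the resource ceiling."""


@dataclass
class ReservationLedger:
    """Hypothetical ledger; not the package's ResourceManager."""
    ceiling: Decimal
    reservations: Dict[str, Decimal] = field(default_factory=dict)

    @property
    def remaining(self) -> Decimal:
        return self.ceiling - sum(self.reservations.values(), Decimal("0"))

    def reserve(self, run_id: str, amount: Decimal) -> None:
        if amount <= 0:
            raise ValueError("reservation must be positive")
        if run_id in self.reservations:
            raise ValueError(f"{run_id} already holds a reservation")
        if amount > self.remaining:
            raise CeilingExceeded(f"{amount} exceeds remaining {self.remaining}")
        self.reservations[run_id] = amount

    def release(self, run_id: str) -> Decimal:
        """Return a reservation to the pool and report its amount."""
        return self.reservations.pop(run_id)


ledger = ReservationLedger(ceiling=Decimal("25"))
ledger.reserve("run-1", Decimal("2.50"))
print(ledger.remaining)
```

Decimal is used instead of float so that dollar amounts such as 2.50 do not pick up binary rounding error.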

3. Expose AgentOS to Hermes through MCP

Hermes is the primary Phase 0 dogfooding interface, but it remains a thin client. The package exposes a minimal MCP-style stdio server scaffold:

agentos-mcp

The server publishes the Phase 0 Hermes tools:

create_campaign, list_campaigns, campaign_status, approve, reject,
list_artifacts, open_artifact, receipt,
execution_policy_get, execution_policy_update

Campaign Kernel owns the state and business logic behind those tools.
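
The "thin client" idea can be sketched as a dispatcher that forwards tool calls to a kernel object which alone holds state. Everything below is hypothetical: SketchKernel, handle_tool_call, the campaign id format, and the status strings are illustrative assumptions, not the AgentOSMCPServer or HermesAdapter API, and only four of the published tool names are covered.

```python
class SketchKernel:
    """Hypothetical system of record; all campaign state lives here."""

    def __init__(self):
        self._campaigns = {}

    def create_campaign(self, goal: str) -> str:
        campaign_id = f"campaign-{len(self._campaigns) + 1}"
        self._campaigns[campaign_id] = {"goal": goal, "status": "pending_approval"}
        return campaign_id

    def list_campaigns(self) -> list:
        return sorted(self._campaigns)

    def campaign_status(self, campaign_id: str) -> str:
        return self._campaigns[campaign_id]["status"]

    def approve(self, campaign_id: str) -> str:
        self._campaigns[campaign_id]["status"] = "approved"
        return "approved"


# Subset of the published tool names that this sketch implements.
ALLOWED_TOOLS = {"create_campaign", "list_campaigns", "campaign_status", "approve"}


def handle_tool_call(kernel: SketchKernel, name: str, arguments: dict):
    """Forward a tool call to the kernel; the client keeps no state itself."""
    if name not in ALLOWED_TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return getattr(kernel, name)(**arguments)


kernel = SketchKernel()
cid = handle_tool_call(kernel, "create_campaign", {"goal": "Ship receipts API"})
handle_tool_call(kernel, "approve", {"campaign_id": cid})
print(handle_tool_call(kernel, "campaign_status", {"campaign_id": cid}))
```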

Create a review dossier from an example campaign:

campaigns compile examples/revenue-growth.yaml

Or from Python:

from campaigns import CampaignSpec, CampaignCompiler

spec = CampaignSpec.from_dict({
    "objective": "Increase recurring revenue",
    "metrics": ["revenue", "conversions"],
    "employed_harnesses": [{
        "agent_name": "Market Researcher",
        "role": "market_researcher",
        "objective": "Research the market with RAG-grounded evidence",
        "components": ["rag", "trace", "decision_log", "evaluation"],
    }],
})

dossier = CampaignCompiler().compile(spec)
print(dossier.to_dict()["workflow"])

Current primitives

ResourceManager, ExecutionPolicy, and ResourceReservation: open-source resource ceilings/reservations for BYOK, subscription, token API, and local providers without payment-wallet logic.
HermesAdapter and AgentOSMCPServer: thin Phase 0 Hermes dogfooding surface over Campaign Kernel state; Hermes remains a client, not the system of record.
AgentHarnessDefinition: campaign-side reference to a user-created targeted AgentRL harness.
CampaignSpec: user-defined goal, budget, timeline, success metrics, constraints, and employed harnesses.
ArchitectureLayer: explicit Runtime / Harness Lifecycle / Swarm Operating System boundary so Campaigns stays above AgentRL.
WorldModelScenario: simulated future with expected metrics, cost, risk, and rationale before execution.
AgentRLPodInstantiation: portable declaration of an AgentRL pod used by an employed or contract agent.
EmployedAgent: accountable fleet participant with role, mandate, decision rights, contracts, and review obligations.
Contract: outsourced short-term specialist work with success criteria, trace requirements, and a contracted pod.
WorkflowStep: DAG step for harness creation, campaign definition, fleet employment, contract work, synthesis, performance review, ultimate review, and AgentRL evolution.
TraceMonitor: user-monitorable trace surface for fleet performance reviews.
PerformanceReview: scorecard scaffold for reviewing employed agents without micromanagement.
ReviewDossier: final artifact the user reviews before approving execution, accepting outcomes, or triggering another iteration.
CampaignAutorun: simple fit / transform / score / autorun primitive for bounded observe-plan-act-verify-review loops.
AutorunPolicy, GoalCheck, and CampaignIteration: /goal-style loop limits, stop conditions, second-model goal checks, independent final auditor hints, budget pause/resume state, and iteration records.
RetrospectiveFeedback: continual-learning feedback that routes reinforcement to either Campaigns-owned next-iteration strategy or AgentRL-owned agent harness lifecycle updates.
CampaignAutorun.final_review(...): after the user gives final review, a retro agent traverses trace surfaces across all employed agents, attributes root cause, and plans AgentRL self-reinforcement for the relevant harness.

SDK retro example:

from campaigns import CampaignAutorun

# `campaign` is a CampaignSpec, such as the `spec` built in the Quick start example.
runner = CampaignAutorun().fit(campaign)
runner.autorun(max_loops=1)
retro = runner.retro({
    "summary": "The Market Researcher missed competitor pricing evidence.",
    "attention_level": "agentrl",
    "target": "Market Researcher",
    "reinforce": "Require competitor price citations before recommendations.",
})
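
The routing described for RetrospectiveFeedback, where feedback goes either to Campaigns-owned strategy or to AgentRL-owned harness updates, might look roughly like the sketch below. The route_feedback function, the destination strings, and the "campaign" attention level are assumptions; only the "agentrl" value appears in the example above.

```python
def route_feedback(feedback: dict) -> str:
    """Pick the owner that should act on a piece of retrospective feedback."""
    level = feedback.get("attention_level")
    if level == "agentrl":
        # Harness-level problem: hand off to the harness lifecycle owner.
        return "agentrl_harness_lifecycle"
    if level == "campaign":  # assumption: this value is not shown in the source
        # Strategy-level problem: adjust the next campaign iteration.
        return "campaigns_next_iteration"
    raise ValueError(f"unknown attention_level: {level!r}")


destination = route_feedback({
    "summary": "The Market Researcher missed competitor pricing evidence.",
    "attention_level": "agentrl",
    "target": "Market Researcher",
})
print(destination)
```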

Boundary

Campaigns does not implement agent runtimes, model training, harness evaluation, harness evolution, or deployment. It records which AgentRL pod should own those lifecycle responsibilities and how the campaign organization composes them. AgentRL does not implement campaign autorun, campaign organizations, contracted-worker queues, performance-review dashboards, or marketing/business workflow policy; those belong in Campaigns.

Claude-style loop autorun

Campaigns includes a simple scikit-learn-style autorun primitive for dynamic campaign workflows:

from campaigns import CampaignAutorun

runner = CampaignAutorun().fit(spec)
dossier = runner.transform()
readiness = runner.score()
result = runner.autorun(max_loops=3)

The autorun loop is intentionally an operating plan, not an agent runtime:

observe -> plan -> act -> verify -> review -> repeat until approval/stop/limit

It can select workflow steps dynamically across loops, preserve trace/review surfaces, and stop at ultimate user review. Runtime execution is delegated to agent systems; harness lifecycle feedback is handed to AgentRL.
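
As a rough illustration of the observe, plan, act, verify, review cycle and its three exits (approval, stop, limit), here is a self-contained sketch. The autorun_sketch function, its review callback, and the "continue" and "exhausted" strings are hypothetical; this is not how CampaignAutorun is implemented, and the act step is a placeholder because real execution is delegated to agent runtimes.

```python
from typing import Callable, List, Tuple


def autorun_sketch(
    steps: List[str],
    review: Callable[[str], str],
    max_loops: int = 3,
) -> Tuple[str, List[dict]]:
    """Run a bounded loop until the reviewer approves, stops, or the limit hits."""
    history: List[dict] = []
    for loop in range(1, max_loops + 1):
        # observe: see which steps earlier loops already completed
        done = [record["step"] for record in history]
        # plan: pick the next step that has not run yet
        remaining = [step for step in steps if step not in done]
        if not remaining:
            return "exhausted", history
        step = remaining[0]
        # act: placeholder; a real system would delegate to an agent runtime
        artifact = f"artifact for {step}"
        # verify: placeholder check that the step produced something
        verified = bool(artifact)
        # review: the callback stands in for the user or a reviewer
        decision = review(step) if verified else "stop"
        history.append({"loop": loop, "step": step, "decision": decision})
        if decision in ("approve", "stop"):
            return decision, history
    return "limit", history


outcome, history = autorun_sketch(
    ["research", "draft", "synthesize"],
    review=lambda step: "approve" if step == "draft" else "continue",
)
print(outcome, len(history))
```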

CLI:

campaigns autorun examples/revenue-growth.yaml --loops 3

Sister commercial app

The private sister repo is campaigns-app. It provides the commercial interface around the open-source core: hosted UI, billing, user workspaces, approvals, trace/performance dashboards, and integrations.
