Home

OperAID Wiki

OperAID is an open-source testbed for evaluating LLM agents as autonomous operators of 5G Core networks deployed on Kubernetes. It implements a closed-loop pipeline:

Fault Injection → Agentic Diagnosis → Remediation → Execution-Based Verification

By default the framework targets Open5GS (deployed via openverso-charts). Deployment profiles generalize it so that any Kubernetes workload can be tested.
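
As a rough illustration of the closed-loop pipeline above, the four stages can be sketched as pluggable callables run in sequence. This is a toy model under stated assumptions, not OperAID's actual code: the `run_closed_loop` function, the `RunResult` type, the in-memory `cluster` dict, and all four stage functions are hypothetical names invented for this sketch, and the example only loosely mirrors scenario S3 (UPF scaled to 0).

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical in-memory stand-in for cluster state (not OperAID's data model).
Cluster = Dict[str, int]


@dataclass
class RunResult:
    scenario: str
    diagnosis: str
    action: str
    verified: bool


def run_closed_loop(
    cluster: Cluster,
    scenario: str,
    inject: Callable[[Cluster], None],
    diagnose: Callable[[Cluster], str],
    remediate: Callable[[Cluster, str], str],
    verify: Callable[[Cluster], bool],
) -> RunResult:
    """Run one fault -> diagnosis -> remediation -> verification cycle."""
    inject(cluster)                         # 1. fault injection
    diagnosis = diagnose(cluster)           # 2. diagnosis (an LLM agent in the real system)
    action = remediate(cluster, diagnosis)  # 3. remediation
    verified = verify(cluster)              # 4. verification by checking resulting state
    return RunResult(scenario, diagnosis, action, verified)


# Toy stages, assumed for illustration only.
def inject_upf_scale_zero(cluster: Cluster) -> None:
    cluster["upf_replicas"] = 0


def diagnose_replicas(cluster: Cluster) -> str:
    return "upf_scaled_to_zero" if cluster["upf_replicas"] == 0 else "healthy"


def remediate_scale_up(cluster: Cluster, diagnosis: str) -> str:
    if diagnosis == "upf_scaled_to_zero":
        cluster["upf_replicas"] = 1
        return "scale upf to 1 replica"
    return "no-op"


def verify_upf_running(cluster: Cluster) -> bool:
    return cluster["upf_replicas"] >= 1


cluster = {"upf_replicas": 1}
result = run_closed_loop(
    cluster,
    "S3",
    inject_upf_scale_zero,
    diagnose_replicas,
    remediate_scale_up,
    verify_upf_running,
)
print(result)
```

The point of the sketch is that success is decided by the final verification step, which inspects the resulting state rather than trusting the diagnosis text.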

Pages

Quick Start — install, env setup, first run
Architecture — components and pipeline overview
Fault Scenarios — S1 NetworkPolicy, S2 ConfigMap, S3 UPF Scale
Deployment Profiles — the JSON contract that drives everything
Diagnosis Engine — multi-turn LLM agent, prompts, retry logic
Diagnostic Tools — built-in kubectl_* tools and custom-tool extension
Safety & Guardrails — command allowlists, dangerous-pattern filters, namespace pinning
Running Experiments — run_experiment.sh flow and CLI flags
Suite Configuration — YAML suites, experiment matrices
Results & Outputs — directory layout, summary.csv, suite_statistics.json
Visualization — paper figures and stats regeneration
Configuration Reference — config.env and environment variables

Key Results (April 2026 — 900 runs)

Metric	Value
Overall LLM success rate	36.0%
Average with tools	70.7%
Average without tools	7.1%
Best small model (3B active params)	Qwen3.5-35b-a3b — 93.3% with tools

Tool access raises average success from 7.1% to 70.7% (+63.6 pp). The hardest scenario is S1 (NetworkPolicy) at 16.0%; the easiest is S3 (UPF scaled to 0) at 49.3%.
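
The percentage-point gain quoted above can be checked directly from the two averages in the table. The short snippet below is only an arithmetic check. The relative ratio it prints is derived here from the reported averages and is not a figure stated in the results.

```python
# Averages taken from the Key Results table (percent).
with_tools = 70.7
without_tools = 7.1

# Absolute gain in percentage points; rounded to avoid floating-point noise.
gain_pp = round(with_tools - without_tools, 1)

# Relative improvement (derived here, not reported in the source).
ratio = with_tools / without_tools

print(f"+{gain_pp} pp, {ratio:.1f}x")  # prints "+63.6 pp, 10.0x"
```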

Citation

@inproceedings{operaid2026,
  title={OperAID: Benchmarking LLM Agents for Autonomous Kubernetes Fault Remediation},
  author={de Castro, Ariel G. and Vandikas, Konstantinos and Ferlin-Reiter, Simone and Chiesa, Marco and Rothenberg, Christian E.},
  booktitle={IEEE NetSoft Trust 6G-Net Workshop},
  year={2026}
}
