arielgoes edited this page May 11, 2026 · 2 revisions

OperAID Wiki

OperAID is an open-source testbed for evaluating LLM agents as autonomous operators of 5G Core networks deployed on Kubernetes. It implements a closed-loop pipeline:

Fault Injection → Agentic Diagnosis → Remediation → Execution-Based Verification

The framework targets Open5GS by default (via openverso-charts) but is generalized through deployment profiles so any Kubernetes workload can be tested.
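The closed-loop pipeline above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not OperAID's actual code: the `Cluster`, `Profile`, and `run_episode` names are hypothetical, the cluster is an in-memory dictionary rather than a real Kubernetes API, and the diagnosis step is a rule-based stand-in for an LLM agent. The injected fault mirrors scenario S3 (UPF scaled to 0).

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Cluster:
    """Toy stand-in for a Kubernetes cluster: workload name -> replica count."""
    replicas: Dict[str, int] = field(default_factory=dict)


@dataclass
class Profile:
    """Hypothetical deployment profile: desired replica count per workload."""
    desired: Dict[str, int]


def inject_fault(cluster: Cluster, workload: str) -> None:
    # Fault injection: scale one workload to zero replicas.
    cluster.replicas[workload] = 0


def diagnose(cluster: Cluster, profile: Profile) -> Dict[str, int]:
    # Stand-in for agentic diagnosis; a real run would query an LLM here.
    # Returns a remediation plan: workload -> replica count to restore.
    return {
        name: want
        for name, want in profile.desired.items()
        if cluster.replicas.get(name, 0) != want
    }


def remediate(cluster: Cluster, plan: Dict[str, int]) -> None:
    # Remediation: apply the plan to the (simulated) cluster.
    for name, want in plan.items():
        cluster.replicas[name] = want


def verify(cluster: Cluster, profile: Profile) -> bool:
    # Execution-based verification: inspect the resulting state
    # instead of trusting the agent's own report.
    return all(
        cluster.replicas.get(name, 0) == want
        for name, want in profile.desired.items()
    )


def run_episode(cluster: Cluster, profile: Profile, workload: str) -> bool:
    inject_fault(cluster, workload)
    plan = diagnose(cluster, profile)
    remediate(cluster, plan)
    return verify(cluster, profile)


if __name__ == "__main__":
    profile = Profile(desired={"upf": 1, "amf": 1})
    cluster = Cluster(replicas={"upf": 1, "amf": 1})
    print(run_episode(cluster, profile, "upf"))
```

Keeping verification separate from diagnosis and remediation is the point of the sketch: success is judged from the observed state, so an agent that reports a fix without applying one still fails the episode.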

Key Results (April 2026 — 900 runs)

| Metric | Value |
| --- | --- |
| Overall LLM success rate | 36.0% |
| Average with tools | 70.7% |
| Average without tools | 7.1% |
| Best small model (3B active params) | Qwen3.5-35b-a3b (93.3% with tools) |

Tool access raises average success from 7.1% to 70.7% (+63.6 pp). The hardest scenario is S1 (NetworkPolicy) at 16.0%; the easiest is S3 (UPF scaled to 0) at 49.3%.
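The gain is stated in percentage points (pp), an absolute difference between two rates, which is not the same as a relative increase. The short sketch below reproduces the arithmetic from the two averages reported above; the `tool_uplift` helper is a hypothetical name introduced only for illustration.

```python
from typing import Tuple


def tool_uplift(without_pct: float, with_pct: float) -> Tuple[float, float]:
    """Return (percentage-point gain, relative multiplier) between two rates."""
    pp_gain = with_pct - without_pct   # absolute difference, in pp
    ratio = with_pct / without_pct     # how many times higher the rate is
    return round(pp_gain, 1), round(ratio, 2)


pp, ratio = tool_uplift(7.1, 70.7)
print(f"+{pp} pp, {ratio}x")  # prints "+63.6 pp, 9.96x"
```

In relative terms, the average success rate with tools is roughly ten times the rate without them.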

Citation

@inproceedings{operaid2026,
  title={OperAID: Benchmarking LLM Agents for Autonomous Kubernetes Fault Remediation},
  author={de Castro, Ariel G. and Vandikas, Konstantinos and Ferlin-Reiter, Simone and Chiesa, Marco and Rothenberg, Christian E.},
  booktitle={IEEE NetSoft Trust 6G-Net Workshop},
  year={2026}
}
