An OpenEnv benchmark testing the ability of AI agents to act as Site Reliability Engineers (SREs) by diagnosing and filtering raw production failure logs.
Updated Apr 8, 2026 - Python
A family of long-horizon software-engineering environments for OpenEnv, adapted from https://github.com/Proximal-Labs/frontier-swe
Deterministic evaluation environment for AI code reviewers covering bugs, security (OWASP), and architecture via FastAPI + OpenEnv.
AI-powered system for low-exposure route optimization using AQI, simulation, and intelligent decision-making
We're addicted to solving real issues ~ Team ComputeXor
Gymnasium RL environment for AI-powered customer support triage — classify, prioritize, assign, and respond to emails under SLA pressure. Built for the OpenEnv spec.
AI research environment that simulates the end-to-end scientific discovery process, enabling agents to analyze papers, generate hypotheses, design experiments, and validate results collaboratively
CyberRange is an advanced, self-improving simulated environment designed to train and benchmark autonomous security agents in complex enterprise incident response.
📧 Intelligent Agentic Workflow for Autonomous Enterprise Email Triage. Built with OpenEnv, featuring Chain-of-Thought reasoning and Self-Correcting agent logic for high-stakes corporate routing.
High-fidelity Reinforcement Learning environment for smart grids. Features a custom DC Power Flow physics solver and real-world AT&C telemetry to train AI in power distribution and fault isolation.
Multi-zone disaster relief AI env for Meta PyTorch OpenEnv Hackathon. 4-stage pipeline: PyTorch ZoneScorerNet -> Triage -> Planner -> Action Agent. False SOS detection, cascading failures, airlift precision.
An OpenEnv-compliant reinforcement learning environment designed to train and evaluate AI agents on real-world SQL debugging, performance tuning, and schema design.
An OpenEnv RL environment where an LLM agent plays the buyer and negotiates against an LLM-powered seller over real marketplace listings.
RunbookOps-caseop: Deterministic OpenEnv environment for SaaS incident triage, runbook-driven resolution, and agent evaluation.
Universal evaluation layer for OpenEnv agentic RL environments. Measures what an agent learned - not just how much reward it accumulated.
A reinforcement learning agent that learns to intelligently shape electricity demand, reducing peak loads and optimizing energy consumption in real-time.
Execution-grounded SQL optimization OpenEnv. Agents rewrite slow SQL and get rewarded using real DuckDB timing + result correctness across 5 anti-pattern tasks.
OpenEnv Hackathon SF