AISBench
Popular repositories
- mini-swe-agent Public, forked from SWE-agent/mini-swe-agent
  The 100-line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo, but scores >74% on SWE-bench Verified!
  Python
- terminal-bench-2 Public, forked from harbor-framework/terminal-bench-2
  All environments preset in Docker images.
  Shell
Repositories
- benchmark Public
  AISBench Benchmark is a model evaluation tool built on OpenCompass. It is compatible with OpenCompass’s configuration system, dataset structure, and model backend implementation, and extends support to service-based models.
- terminal-bench-2 Public, forked from harbor-framework/terminal-bench-2
  All environments preset in Docker images.
- OneIG-Benchmark Public, forked from OneIG-Bench/OneIG-Benchmark
  [NeurIPS 2025 DB] OneIG-Bench is a meticulously designed, comprehensive benchmark framework for fine-grained evaluation of T2I models across multiple dimensions, including subject-element alignment, text rendering precision, reasoning-generated content, stylization, and diversity.
- mini-swe-agent Public, forked from SWE-agent/mini-swe-agent
  The 100-line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo, but scores >74% on SWE-bench Verified!
People
This organization has no public members. You must be a member to see who’s a part of this organization.