AISBench

benchmark Public

AISBench Benchmark is a model evaluation tool built on OpenCompass, compatible with OpenCompass’s configuration system, dataset structure, and model backend implementation, while extending support …

Python 123 46

benchmark-mindie-old Public

Plugin for AISBench/benchmark on Gitee

Python 1 1

datasets Public

Special dataset generation methods for benchmark

Python 1 2

ci_test Public

Test CI

1

mini-swe-agent Public

Forked from SWE-agent/mini-swe-agent

The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >74% on SWE-bench verified!

Python

terminal-bench-2 Public

Forked from harbor-framework/terminal-bench-2

Presets all environments in Docker images

Shell
