Popular repositories

  1. WebBench (Public)

    📚 Benchmark your browser agent on ~2.5k READ- and ACTION-based tasks

    96 6

  2. westworld (Public)

    Python 16 4

  3. browserbench (Public)

    The First Benchmark for Browser Infrastructure Stealth

    Python 16 1

  4. sample_sft (Public)

    Sample SFT trajectories produced by action collectors

    1

  5. noodle-flights (Public)

    Open-source flight search application for benchmarking and training an AI agent's ability to search for and select flights. https://halluminate.ai/blog/westworld

    TypeScript 1 1

  6. harbor-cookbook (Public)

    Forked from harbor-framework/harbor-cookbook

    Realistic examples of building evals and optimizing agents with Harbor

    Python

Repositories

Showing 7 of 7 repositories
