- #297 · junemoon-happy opened on May 12, 2026 · 3
Issues
is:issue state:open
Issue creation is restricted in this repository
Search results
- #380 · [Feature request] Optimize the resource cleanup logic for the SWE-Bench_Pro dataset · labels: content_check_passed, enhancement · Status: Open · AISBench/benchmark
- #375 · [Question] Are there differences between AisBench and Evalscope in how performance metrics are calculated? With the same service configuration and the same test scenario, the performance results differ considerably. · labels: content_check_passed, question · Status: Open · AISBench/benchmark
- #369 · [Bug] MultiTurnGenInferencer infer_every mode does not correctly accumulate conversation history · labels: content_check_passed, question · Status: Open · AISBench/benchmark
- #359 · [gsm8k] Answer extraction can fail when the model outputs answers with thousands separators · labels: content_check_failed · Status: Open · AISBench/benchmark
- #348 · [Question] How can per-video data be obtained from vbench scoring results? · labels: content_check_passed, question · Status: Open · AISBench/benchmark
- #309 · [Question] aisbench raises AttributeError: 'PreTrainedConfig' object has no attribute 'max_position_embeddings' · labels: content_check_passed, question · Status: Open · AISBench/benchmark
- #297 · [Roadmap] AISBench 2026 Q2 Roadmap · labels: content_check_failed · Status: Open · AISBench/benchmark
- #294 · [Feature request] Add SLA auto-tuning settings to the AISBench stress-testing tool, like those in Evalscope · labels: content_check_passed, enhancement · Status: Open · AISBench/benchmark
- #284 · [RFC] [Performance evaluation] Enhance AISBench performance evaluation capabilities · labels: content_check_failed · Status: Open · AISBench/benchmark
- #275 · [Feature request] Allow passing a string such as 52K directly when configuring a synthetic dataset, so that it is parsed automatically · labels: content_check_passed, enhancement · Status: Open · AISBench/benchmark
- #262 · [Question] How to obtain precise performance metrics for each individual request in the performance test results · labels: content_check_passed, question · Status: Open · AISBench/benchmark
- #256 · [Question] When using aisbench for LoRA performance testing with Qwen3-32B as the base model, many Failed Requests are shown after the run finishes · labels: content_check_passed, question · Status: Open · AISBench/benchmark