Actions: sbintuitions/flexeval
Showing runs from all workflows
653 workflow runs
VLLM and HuggingFaceLM that forces a response prefix
Run batch-api tests #156: Pull request #288 synchronize by junya-takayama
VLLM and HuggingFaceLM that forces a response prefix
Run tests #1007: Pull request #288 synchronize by junya-takayama
VLLM and HuggingFaceLM that forces a response prefix
Run tests #1006: Pull request #288 synchronize by junya-takayama
VLLM and HuggingFaceLM that forces a response prefix
Run batch-api tests #155: Pull request #288 synchronize by junya-takayama
VLLM and HuggingFaceLM that forces a response prefix
Run tests #1005: Pull request #288 opened by junya-takayama
VLLM and HuggingFaceLM that forces a response prefix
Run batch-api tests #154: Pull request #288 opened by junya-takayama
ReasoningParser
Run tests #1004: Pull request #287 synchronize by junya-takayama
ReasoningParser
Run batch-api tests #153: Pull request #287 synchronize by junya-takayama
ReasoningParser
Run batch-api tests #152: Pull request #287 synchronize by junya-takayama
ReasoningParser
Run tests #1003: Pull request #287 synchronize by junya-takayama
ReasoningParser
Run tests #1002: Pull request #287 synchronize by junya-takayama
ReasoningParser
Run batch-api tests #151: Pull request #287 synchronize by junya-takayama
ReasoningParser
Run tests #1001: Pull request #287 opened by junya-takayama
ReasoningParser
Run batch-api tests #150: Pull request #287 opened by junya-takayama
flexeval_lm
Run tests #999: Pull request #286 synchronize by junya-takayama
flexeval_lm
Run tests #998: Pull request #286 opened by junya-takayama
reasoning_text and tool_calls in post-hoc LLM judges via flexeval_file.
Run tests #996: Pull request #285 synchronize by junya-takayama
reasoning_text and tool_calls in post-hoc LLM judges via flexeval_file.
Run batch-api tests #149: Pull request #285 synchronize by junya-takayama
reasoning_text and tool_calls in post-hoc LLM judges via flexeval_file.
Run tests #995: Pull request #285 synchronize by junya-takayama
reasoning_text and tool_calls in post-hoc LLM judges via flexeval_file.
Run batch-api tests #148: Pull request #285 synchronize by junya-takayama
reasoning_text and tool_calls in post-hoc LLM judges via flexeval_file.
Run tests #994: Pull request #285 synchronize by junya-takayama
reasoning_text and tool_calls in post-hoc LLM judges via flexeval_file.
Run batch-api tests #147: Pull request #285 synchronize by junya-takayama