One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
Updated May 15, 2026 - Python
Versatile Evaluation of Speech and Audio
A standalone tool for evaluating Automatic Speech Recognition (ASR) models, optimized for medical and clinical speech recognition, using the Word Error Rate (WER) metric.
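For readers unfamiliar with the metric, WER is commonly defined as the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the ASR hypothesis, divided by the number of reference words. The following is a minimal sketch of that standard definition, not the tool's own implementation; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative only: tokenizes on whitespace and applies no text
    normalization (casing, punctuation), which real tools usually do.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of five reference words.
    print(word_error_rate("the patient has a fever", "the patient has fever"))
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.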
Professional portfolio for AI Evaluation, Audio Analysis, and UX Research. Specialized in Human-in-the-Loop (HITL) data integrity and design-focused research.
Extract grounded evidence from video files to enable automated review and visual understanding for coding agents.