Mega ASR

Mega-ASR is a foundation ASR model from xzf-thu for in-the-wild speech recognition under severe acoustic degradation. It is trained on 2.6M samples organized around 7 atomic acoustic conditions and 54 compound acoustic scenarios, covering noise, far-field speech, obstruction, echo and reverberation, recording artifacts, electronic distortion, and transmission dropout. The project is built on Qwen3-ASR and uses A2S-SFT plus DG-WGPO reinforcement learning to improve robust transcription, semantic recovery, and local keyword reconstruction, and to reduce hallucinations, empty outputs, and dropped utterances in challenging environments. According to the repository, the project will be released under Apache 2.0.
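To make the idea of a compound acoustic scenario concrete, here is a minimal sketch that stacks two of the listed conditions, additive noise and transmission dropout, on a synthetic tone. It is an illustration only and is not the Mega-ASR data pipeline: the function names, the white-noise model, the SNR value, and the dropout span are all assumptions chosen for the example.

```python
import math
import random


def add_noise_at_snr(signal, snr_db, rng):
    """Add white Gaussian noise scaled to a target signal-to-noise ratio in dB.

    Assumes a non-empty list of float samples (hypothetical helper).
    """
    signal_power = sum(s * s for s in signal) / len(signal)
    noise_power = signal_power / (10 ** (snr_db / 10))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in signal]


def apply_dropout(signal, start, length):
    """Zero out a contiguous span of samples to mimic a transmission dropout."""
    out = list(signal)
    end = min(len(out), start + length)
    for i in range(start, end):
        out[i] = 0.0
    return out


def degrade(signal, snr_db, drop_start, drop_len, seed=0):
    """Compose the two conditions: noise first, then dropout."""
    rng = random.Random(seed)
    noisy = add_noise_at_snr(signal, snr_db, rng)
    return apply_dropout(noisy, drop_start, drop_len)


# One second of a 440 Hz tone at 16 kHz stands in for clean speech.
sample_rate = 16000
clean = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(sample_rate)]

# Assumed settings: 5 dB SNR and a 50 ms dropout starting at 250 ms.
degraded = degrade(clean, snr_db=5.0, drop_start=4000, drop_len=800)
```

A robust recognizer would be expected to transcribe `degraded` with few errors despite both the noise floor and the missing span. Real degradations such as reverberation or electronic distortion would need more elaborate models than this sketch uses.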
New Multimodal Gen 3
Released: May 19, 2026

Overview

Mega-ASR is an open-source robust automatic speech recognition model built for difficult real-world audio with noise, far-field speech, obstruction, echo, distortion, and transmission dropout.

About Shanghai AI Laboratory

Shanghai AI Laboratory (Shanghai AI Lab) is a research organization advancing next-generation AI and open innovation. Through initiatives like InternAI, it builds and open-sources models and tools, supports researchers and developers worldwide, and pushes practical AI forward from research to real-world applications.

Tools using Mega ASR

No tools found for this model yet.

Last updated: May 25, 2026