Mega ASR

Mega-ASR is a foundation ASR model from xzf-thu for in-the-wild speech recognition under severe acoustic degradation. It is trained around 7 atomic acoustic conditions and 54 compound acoustic scenarios, using 2.6M training samples that cover noise, far-field speech, obstruction, echo and reverberation, recording artifacts, electronic distortion, and transmission dropout. The project is built on Qwen3-ASR and uses A2S-SFT plus DG-WGPO reinforcement learning to improve robust transcription, semantic recovery, and local keyword reconstruction, and to reduce hallucinations, empty outputs, and dropped utterances in challenging environments. According to the repository, the project will be released under the Apache 2.0 license.

Overview

Mega-ASR is an open-source robust automatic speech recognition model built for difficult real-world audio with noise, far-field speech, obstruction, echo, distortion, and transmission dropout.

🗒 Transcription, 🔊 Audio enhancement, 🎙️ Voice detection

About Shanghai AI Laboratory

Shanghai AI Laboratory (Shanghai AI Lab) is a research organization advancing next-generation AI and open innovation. Through initiatives like InternAI, it builds and open-sources models and tools, supports researchers and developers worldwide, and pushes practical AI forward from research to real-world applications.

Website: shlab.org.cn

Last updated: July 7, 2026
