Mega-ASR

xzf-thu

The first foundation ASR model built for the real world: 7 atomic acoustic conditions, 54 compound scenarios, 2.6M samples, and gains of up to ~30% over SOTA in the conditions where other models fall apart. **You'll come back to MEGA-ASR after the rest fail in the wild. ⭐**

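AI Summary

Mega-ASR is a robust automatic speech recognition (ASR) foundation model for real-world scenarios, designed for high-accuracy transcription in complex acoustic environments. Its core feature is systematic coverage of 7 atomic acoustic conditions (such as noise, far-field speech, occlusion, reverberation, and recording distortion) and 54 compound scenarios. It is trained on 2.6 million realistic simulated samples, using A2S-SFT supervised fine-tuning and DG-WGPO reinforcement learning optimization, and achieves a relative improvement of up to about 30% over SOTA models under strong interference. It is suited to deployments with strict robustness requirements, such as in-vehicle voice, meeting transcription, outdoor device interaction, and speech capture on industrial sites.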

Python


1.1k Stars

Star growth

Today: 0

Last 7 days: 0

Last 30 days: +19

Overall score: 52.44

Default branch: main
