ISCA Archive S4SG 2022

What does it really take to build speech recognition systems for the next billion users?

Pratyush Kumar

Dr. Pratyush Kumar talks about the AI4Bharat project, which aims to "bring parity with respect to English in AI technologies for Indian languages with open-source contributions in datasets, models, and applications and by enabling an innovation ecosystem". He discusses the trifecta of data, models, and applications required to realise this mission. He also describes three datasets covering several low-resource Indian languages (Dhwani, Shrutilipi, and Kathbath), the challenges associated with curating them, and experiments with ASR models to achieve state-of-the-art performance. The talk concludes with a discussion of open-source contributions to the AI4Bharat platform, including the Chitralekha project, an ASR tool integrated in NPTEL (a leading online learning platform) with support for 9 Indian languages.


Cite as: Kumar, P. (2022) What does it really take to build speech recognition systems for the next billion users? Proc. 1st Workshop on Speech for Social Good (S4SG).

@inproceedings{kumar22b_s4sg,
  author={Pratyush Kumar},
  title={{What does it really take to build speech recognition systems for the next billion users?}},
  year=2022,
  booktitle={Proc. 1st Workshop on Speech for Social Good (S4SG)},
  pages={}
}