Speech Data Collection

Welcome to the Speech Data Collection app, developed by Sabudh. Our aim is to build a platform for collecting speech data efficiently and to create a publicly available "speech to text" corpus to fuel research and development of Automatic Speech Recognition (ASR) models for various Indian languages. We will start with Hindi.

A few researchers have worked on Hindi ASR models, but their data is either not available on the internet or difficult to obtain, which drives away many potential researchers. We aim to tackle this issue by creating a speech to text corpus for the Hindi language that will be free for anybody to use.

Besides contributing your voice, you can contribute to this project in the following ways: