Welcome to the Speech Data Collection app developed by Sabudh. Our aim is to create a platform where speech data can be collected efficiently, and to build a publicly available "speech to text" corpus to fuel research and development of Automatic Speech Recognition (ASR) models for various Indian languages. We will start with Punjabi.
A few researchers have worked on Punjabi ASR models, but their data is either not available on the internet or hard to obtain, which drives away many potential researchers. We aim to tackle this issue by creating a speech to text corpus for the Punjabi language that will be free for anybody to use.
The code for this web app is open source. If you wish to contribute to this project (other than by contributing your voice), the following are some features we wish to develop: