More than 8 billion people in the world speak more than 7,100 languages. AI cannot converse as fluently or take action in 98% of those languages, which leaves out billions of people. Existing models lag far behind popular English models on many language, voice, and vision tasks, and they do not cater to an Indian audience or the Indian market. Models need to be more adaptable and accessible.
For the reader:
- This is a thesis on our understanding of the optimal routes to building better models, considering the current and extrapolated future state of Indian language datasets and compute availability.
- If any part of the document sparks a thought or idea, leave some notes for the community to tinker with.
- We are looking for model builders, users, and evaluators to contribute to the projects. Click on the buttons to volunteer or partner in the cause.