AI4Bharat
Meity Timelines

Table
0
Name
Deliverables
Remarks
1
Y1-Q1
Build Shoonya, a unified interface for collecting MT, ASR and NLU datasets for all the 22 languages.
Set up teams of language experts (annotators, translators, transcribers) for all the 22 languages.
Run pilot for on-field 100 hours of voice data collection for Tamil.
Collect a total of 100K English sentences from diverse domains which will subsequently be translated to 22 Indian languages.
Collect a total of 100K sentences of everyday conversational content in English which will subsequently be translated to 22 Indian languages
Release 1M mined English-X parallel sentences for 11 languages: Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Nepali, Punjabi, Tamil, Telugu, Urdu
Release 500 hours of mined ASR data for 11 languages: Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi, Tamil, Telugu, Urdu
2
Y1-Q2
12 Phase 1 languages (P1): Assamese, Bengali, Gujarati, Hindi, Kannada, Maithili, Malayalam, Manipuri, Marathi, Sanskrit, Tamil, Urdu.
10 Phase 2 languages (P2): Bodo, Dogri, Kashmiri, Konkani, Nepali, Odia, Punjabi, Santali, Sindhi, Telugu.
Create an MT benchmark containing 10K En-X parallel sentences for P1 languages.
Create an ASR benchmark of 50 hours for P1 languages containing (a) read speech (b) voice commands (c) transcribed extempore conversations (d) transcribed news content (e) transcribed education content (f) transcribed entertainment content.
Create 10 hours of TTS data for P1 languages.
Release synthetic training data containing 100K images each for document layout detection, document text recognition and scene text recognition for all the 22 languages
3
Y1-Q3
Create an MT benchmark containing 10K En-X parallel sentences for P2 languages.
Create an ASR benchmark of 50 hours for P2 languages containing (a) read speech (b) voice commands (c) transcribed extempore conversations (d) transcribed news content (e) transcribed education content (f) transcribed entertainment content.
Create 10 hours of TTS data for P2 languages.
4
Y1-Q4
Create 30K En-X parallel sentences (fine-tuning data) for all 22 languages
Create 200 hours of ASR data for all 22 languages
Create 10 hours of TTS data for all 22 languages
Create a benchmark for Scene Text Recognition containing 500 images for all 22 languages (13 scripts)
Create a benchmark for document layout recognition containing 500 images for all 22 languages (13 scripts)
Create a benchmark for document OCR containing 500 scanned pages for all 22 languages (13 scripts)
5
Y2-Q1
Create 30K En-X parallel sentences (fine-tuning data) for all 22 languages
Create 200 hours of ASR data for all 22 languages
Create 10 hours of TTS data for all 22 languages
Create a benchmark for Scene Text Recognition containing an additional 500 images for all 22 languages (13 scripts)
Create a benchmark for document layout recognition containing 500 images for all 22 languages (13 scripts)
Create a benchmark for document OCR containing 500 scanned pages for all 22 languages (13 scripts)
OCR (scene + document) goals are met for all 22 languages
6
Y2-Q2
Create 40K En-X parallel sentences (fine-tuning data) for all 22 languages
Create 100 hours of ASR data for all 22 languages
Create 10 hours of TTS data for all 22 languages
TTS goals are met for all 22 languages
ASR goals are met for P2 languages
7
Y2-Q3
Create 30K En-X parallel sentences (fine-tuning data) for all 22 languages
Create 100 hours of ASR data for P1 languages
8
Y2-Q4
Create 30K En-X parallel sentences (fine-tuning data) for all 22 languages
Create 100 hours of ASR data for P1 languages
9
Y3-Q1
Create 30K En-X parallel sentences (fine-tuning data) for all 22 languages
Create 100 hours of ASR data for P1 languages
10
Y3-Q2
Create 20K En-X parallel sentences (fine-tuning data) for all 22 languages
Create 100 hours of ASR data for all 22 languages
MT goals are met for all 22 languages
11
Y3-Q3
Create 100 hours of ASR data for all 22 languages
Create 5K QA pairs for all 22 languages
Create 5K NER tagged sentences for all 22 languages
Create 5K sentiment labeled sentences for all 22 languages
ASR goals are met for all 22 languages
12
Y3-Q4
Create 5K QA pairs for all 22 languages
Create 5K NER tagged sentences for all 22 languages
Create 5K sentiment labeled sentences for all 22 languages
Create 100K translated QA pairs (noisy training data) for all 22 languages
Create 100K noisy NER sentences (translation + projection) for all 22 languages
Create 100K translated SA sentences for all 22 languages
NER, SA, QA goals are met for all 22 languages
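
The per-language totals implied by the rows above can be tallied with a short script. This is only a sketch: the figures are transcribed by hand from the Meity Timelines table, the dictionary and function names are hypothetical, and ASR is left out because the mix of "P1" and "all 22 languages" rows makes its tally ambiguous. The MT figure counts fine-tuning data only and excludes the 10K benchmark.

```python
# Per-language targets transcribed from the Meity Timelines table.
# All names here are illustrative, not part of any AI4Bharat tooling.

# TTS hours per quarter for a Phase 1 (P1) and a Phase 2 (P2) language.
tts_hours_p1 = {"Y1-Q2": 10, "Y1-Q4": 10, "Y2-Q1": 10, "Y2-Q2": 10}
tts_hours_p2 = {"Y1-Q3": 10, "Y1-Q4": 10, "Y2-Q1": 10, "Y2-Q2": 10}

# MT fine-tuning sentences per quarter, in thousands, for every language.
mt_finetune_k = {
    "Y1-Q4": 30, "Y2-Q1": 30, "Y2-Q2": 40, "Y2-Q3": 30,
    "Y2-Q4": 30, "Y3-Q1": 30, "Y3-Q2": 20,
}

# NLU items per quarter, in thousands: QA + NER + sentiment,
# plus the three 100K noisy/translated sets in Y3-Q4.
nlu_k = {
    "Y3-Q3": 5 + 5 + 5,
    "Y3-Q4": 5 + 5 + 5 + 100 + 100 + 100,
}


def total(per_quarter):
    """Sum a {quarter: amount} mapping."""
    return sum(per_quarter.values())


if __name__ == "__main__":
    print("TTS hours per P1 language:", total(tts_hours_p1))
    print("TTS hours per P2 language:", total(tts_hours_p2))
    print("MT fine-tuning sentences per language (K):", total(mt_finetune_k))
    print("NLU items per language (K):", total(nlu_k))
```

Under these readings, each language accumulates 40 hours of TTS data, 210K MT fine-tuning sentences and 330K NLU items by the end of Y3-Q4.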

DMU Goals
0
Name
Deliverables
1
Y1-Q1
Develop Shoonya v1 as an open-source tool for collecting MT, ASR and NLU datasets for all the 22 languages.
Set up teams of language experts (annotators, translators, transcribers) for all the 22 languages.
Run pilot for on-field 100 hours of voice data collection for Tamil.
Collect a total of 50K English sentences from diverse domains which will subsequently be translated to 22 Indian languages.
Collect a total of 50K sentences of everyday conversational content in English which will subsequently be translated to 22 Indian languages
Release 1M mined English-X parallel sentences for 11 languages: Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Nepali, Punjabi, Tamil, Telugu, Urdu
Release 500 hours of mined ASR data for 11 languages: Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi, Tamil, Telugu, Urdu
2
Y1-Q2
12 Phase 1 languages (P1): Assamese, Bengali, Gujarati, Hindi, Kannada, Maithili, Malayalam, Manipuri, Marathi, Sanskrit, Tamil, Urdu.
10 Phase 2 languages (P2): Bodo, Dogri, Kashmiri, Konkani, Nepali, Odia, Punjabi, Santali, Sindhi, Telugu.
Develop Shoonya as an open-source tool for collecting MT, ASR and NLU datasets for all the 22 languages.
Create an MT benchmark containing 10K En-X parallel sentences for P1 languages.
Create an ASR benchmark of 25 hours for P1 languages containing (a) read speech (b) voice commands (c) transcribed extempore conversations (d) transcribed news content (e) transcribed education content (f) transcribed entertainment content.
Create 10 hours of TTS data for P1 languages.
Release synthetic training data containing 100K images each for document layout detection, document text recognition and scene text recognition for all the 22 languages
3
Y1-Q3
Develop Shoonya as an open-source tool for collecting MT, ASR and NLU datasets for all the 22 languages.
Create an MT benchmark containing 10K En-X parallel sentences for P2 languages.
Create an ASR benchmark of 50 hours for P2 languages containing (a) read speech (b) voice commands (c) transcribed extempore conversations (d) transcribed news content (e) transcribed education content (f) transcribed entertainment content.
Create 10 hours of TTS data for P2 languages.
4
Y1-Q4
Create 30K En-X parallel sentences (fine-tuning data) for all 22 languages
Create 100 hours of ASR data for all 22 languages
Create 10 hours of TTS data for all 22 languages
Create a benchmark for Scene Text Recognition containing 500 images for all 22 languages (13 scripts)
Create a benchmark for document OCR containing 500 scanned pages for all 22 languages (13 scripts)
5
Y2-Q1
Create 30K En-X parallel sentences (fine-tuning data) for all 22 languages
Create 100 hours of ASR data for all 22 languages
Create 10 hours of TTS data for all 22 languages
Create a benchmark for Scene Text Recognition containing an additional 500 images for all 22 languages (13 scripts)
Create a benchmark for document OCR containing 500 scanned pages for all 22 languages (13 scripts)
6
Y2-Q2
Create 40K En-X parallel sentences (fine-tuning data) for all 22 languages
Create 100 hours of ASR data for all 22 languages
Create 10 hours of TTS data for all 22 languages
7
Y2-Q3
Create 100 hours of ASR data for all 22 languages
Create 5K QA pairs for all 22 languages
Create 5K NER tagged sentences for all 22 languages
Create 5K sentiment labeled sentences for all 22 languages
8
Y2-Q4
Create 100 hours of ASR data for all 22 languages
Create 5K QA pairs for all 22 languages
Create 5K NER tagged sentences for all 22 languages
Create 5K sentiment labeled sentences for all 22 languages
Create 100K translated QA pairs (noisy training data) for all 22 languages
Create 100K noisy NER sentences (translation + projection) for all 22 languages
Create 100K translated SA sentences for all 22 languages
9
Y3-Q1
Build and release version 1 of ASR, TTS, MT, NLU models for P1 languages
10
Y3-Q2
Build and release version 1 of ASR, TTS, MT, NLU models for P2 languages
11
Y3-Q3
Build and release version 2 of ASR, TTS, MT, NLU models for P1 languages
12
Y3-Q4
Build and release version 2 of ASR, TTS, MT, NLU models for P2 languages

Translation 100 x 11 x 21 x 1000 bitext pairs - ~20M → 100M (Ours) → 230M
ASR 500 hours x 22 - 1000 h → 10000 h
TTS 40 hours x 22 - 700 h → 800 h
OCR 100 x 22 x 1000 images - ? → 2M
NLU 330 x 22 x 1000 annotations - ? - 60M
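
To make the shorthand in the summary lines above easier to check, the sketch below multiplies out the factors on the left-hand side of each line. It assumes every "x" means multiplication, which is a reading of the notes rather than something the document states; the figures after the dashes and arrows are left uninterpreted. The variable names are hypothetical.

```python
from math import prod

# Factors copied from the left-hand side of each summary line.
# Treating "x" as multiplication is an assumption about the shorthand.
summary_factors = {
    "Translation (bitext pairs)": (100, 11, 21, 1000),
    "ASR (hours)": (500, 22),
    "TTS (hours)": (40, 22),
    "OCR (images)": (100, 22, 1000),
    "NLU (annotations)": (330, 22, 1000),
}

# Multiply the factors of each line to get its implied total.
summary_totals = {name: prod(factors) for name, factors in summary_factors.items()}

if __name__ == "__main__":
    for name, value in summary_totals.items():
        print(f"{name}: {value:,}")
```

Under that reading the products are 23.1M bitext pairs, 11,000 ASR hours, 880 TTS hours, 2.2M OCR images and 7.26M NLU annotations.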
