AI4Bharat

Meity Timelines

Table

Name

Deliverables

Remarks

Y1-Q1

Build Shoonya, a unified interface for collecting MT, ASR and NLU datasets for all the 22 languages.

Set up teams of language experts (annotators, translators, transcribers) for all the 22 languages.

Run a pilot for on-field collection of 100 hours of voice data for Tamil.

Collect a total of 100K English sentences from diverse domains which will subsequently be translated to 22 Indian languages.

Collect a total of 100K sentences of everyday conversational content in English which will subsequently be translated to 22 Indian languages

Release 1M mined English-X parallel sentences for 11 languages: Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Nepali, Punjabi, Tamil, Telugu, Urdu

Release 500 hours of mined ASR data for 11 languages: Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi, Tamil, Telugu, Urdu

Y1-Q2

12 Phase 1 languages (P1): Assamese, Bengali, Gujarati, Hindi, Kannada, Maithili, Malayalam, Manipuri, Marathi, Sanskrit, Tamil, Urdu.

10 Phase 2 languages (P2): Bodo, Dogri, Kashmiri, Konkani, Nepali, Odia, Punjabi, Santali, Sindhi, Telugu.

Create an MT benchmark containing 10K En-X parallel sentences for P1 languages.

Create an ASR benchmark of 50 hours for P1 languages containing (a) read speech, (b) voice commands, (c) transcribed extempore conversations, (d) transcribed news content, (e) transcribed education content, and (f) transcribed entertainment content.

Create 10 hours of TTS data for P1 languages.

Release synthetic training data containing 100K images each for document layout detection, document text recognition and scene text recognition for all the 22 languages

Y1-Q3

Create an MT benchmark containing 10K En-X parallel sentences for P2 languages.

Create an ASR benchmark of 50 hours for P2 languages containing (a) read speech, (b) voice commands, (c) transcribed extempore conversations, (d) transcribed news content, (e) transcribed education content, and (f) transcribed entertainment content.

Create 10 hours of TTS data for P2 languages.

Y1-Q4

Create 30K En-X parallel sentences (fine-tuning data) for all 22 languages

Create 200 hours of ASR data for all 22 languages

Create 10 hours of TTS data for all 22 languages

Create a benchmark for Scene Text Recognition containing 500 images for all 22 languages (13 scripts)

Create a benchmark for document layout recognition containing 500 images for all 22 languages (13 scripts)

Create a benchmark for document OCR containing 500 scanned pages for all 22 languages (13 scripts)

Y2-Q1

Create 30K En-X parallel sentences (fine-tuning data) for all 22 languages

Create 200 hours of ASR data for all 22 languages

Create 10 hours of TTS data for all 22 languages

Create a benchmark for Scene Text Recognition containing an additional 500 images for all 22 languages (13 scripts)

Create a benchmark for document layout recognition containing 500 images for all 22 languages (13 scripts)

Create a benchmark for document OCR containing 500 scanned pages for all 22 languages (13 scripts)

OCR (scene + document) goals are met for all 22 languages

Y2-Q2

Create 40K En-X parallel sentences (fine-tuning data) for all 22 languages

Create 100 hours of ASR data for all 22 languages

Create 10 hours of TTS data for all 22 languages

TTS goals are met for all 22 languages

ASR goals are met for P2 languages

Y2-Q3

Create 30K En-X parallel sentences (fine-tuning data) for all 22 languages

Create 100 hours of ASR data for P1 languages

Y2-Q4

Create 30K En-X parallel sentences (fine-tuning data) for all 22 languages

Create 100 hours of ASR data for P1 languages

Y3-Q1

Create 30K En-X parallel sentences (fine-tuning data) for all 22 languages

Create 100 hours of ASR data for P1 languages

Y3-Q2

Create 20K En-X parallel sentences (fine-tuning data) for all 22 languages

Create 100 hours of ASR data for all 22 languages

MT goals are met for all 22 languages

Y3-Q3

Create 100 hours of ASR data for all 22 languages

Create 5K QA pairs for all 22 languages

Create 5K NER tagged sentences for all 22 languages

Create 5K sentiment labeled sentences for all 22 languages

ASR goals are met for all 22 languages

Y3-Q4

Create 5K QA pairs for all 22 languages

Create 5K NER tagged sentences for all 22 languages

Create 5K sentiment labeled sentences for all 22 languages

Create 100K translated QA pairs (noisy training data) for all 22 languages

Create 100K noisy NER sentences (translation + projection) for all 22 languages

Create 100K translated SA sentences for all 22 languages

NER, SA, QA goals are met for all 22 languages

DMU Goals

Name

Deliverables

Y1-Q1

Develop Shoonya v1 as an open-source tool for collecting MT, ASR and NLU datasets for all the 22 languages.

Set up teams of language experts (annotators, translators, transcribers) for all the 22 languages.

Run a pilot for on-field collection of 100 hours of voice data for Tamil.

Collect a total of 50K English sentences from diverse domains which will subsequently be translated to 22 Indian languages.

Collect a total of 50K sentences of everyday conversational content in English which will subsequently be translated to 22 Indian languages

Release 1M mined English-X parallel sentences for 11 languages: Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Nepali, Punjabi, Tamil, Telugu, Urdu

Release 500 hours of mined ASR data for 11 languages: Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Odia, Punjabi, Tamil, Telugu, Urdu

Y1-Q2

12 Phase 1 languages (P1): Assamese, Bengali, Gujarati, Hindi, Kannada, Maithili, Malayalam, Manipuri, Marathi, Sanskrit, Tamil, Urdu.

10 Phase 2 languages (P2): Bodo, Dogri, Kashmiri, Konkani, Nepali, Odia, Punjabi, Santali, Sindhi, Telugu.

Develop Shoonya as an open-source tool for collecting MT, ASR and NLU datasets for all the 22 languages.

Create an MT benchmark containing 10K En-X parallel sentences for P1 languages.

Create an ASR benchmark of 25 hours for P1 languages containing (a) read speech, (b) voice commands, (c) transcribed extempore conversations, (d) transcribed news content, (e) transcribed education content, and (f) transcribed entertainment content.

Create 10 hours of TTS data for P1 languages.

Release synthetic training data containing 100K images each for document layout detection, document text recognition and scene text recognition for all the 22 languages

Y1-Q3

Develop Shoonya as an open-source tool for collecting MT, ASR and NLU datasets for all the 22 languages.

Create an MT benchmark containing 10K En-X parallel sentences for P2 languages.

Create 10 hours of TTS data for P2 languages.

Y1-Q4

Create 30K En-X parallel sentences (fine-tuning data) for all 22 languages

Create 100 hours of ASR data for all 22 languages

Create 10 hours of TTS data for all 22 languages

Create a benchmark for Scene Text Recognition containing 500 images for all 22 languages (13 scripts)

Create a benchmark for document OCR containing 500 scanned pages for all 22 languages (13 scripts)

Y2-Q1

Create 30K En-X parallel sentences (fine-tuning data) for all 22 languages

Create 100 hours of ASR data for all 22 languages

Create 10 hours of TTS data for all 22 languages

Create a benchmark for Scene Text Recognition containing an additional 500 images for all 22 languages (13 scripts)

Create a benchmark for document OCR containing 500 scanned pages for all 22 languages (13 scripts)

Y2-Q2

Create 40K En-X parallel sentences (fine-tuning data) for all 22 languages

Create 100 hours of ASR data for all 22 languages

Create 10 hours of TTS data for all 22 languages

Y2-Q3

Create 100 hours of ASR data for all 22 languages

Create 5K QA pairs for all 22 languages

Create 5K NER tagged sentences for all 22 languages

Create 5K sentiment labeled sentences for all 22 languages

Y2-Q4

Create 100 hours of ASR data for all 22 languages

Create 5K QA pairs for all 22 languages

Create 5K NER tagged sentences for all 22 languages

Create 5K sentiment labeled sentences for all 22 languages

Create 100K translated QA pairs (noisy training data) for all 22 languages

Create 100K noisy NER sentences (translation + projection) for all 22 languages

Create 100K translated SA sentences for all 22 languages

Y3-Q1

Build and release version 1 of ASR, TTS, MT, NLU models for P1 languages

Y3-Q2

Build and release version 1 of ASR, TTS, MT, NLU models for P2 languages

Y3-Q3

Build and release version 2 of ASR, TTS, MT, NLU models for P1 languages

Y3-Q4

Build and release version 2 of ASR, TTS, MT, NLU models for P2 languages

Translation 100 x 11 x 21 x 1000 bitext pairs - ~20M → 100M (Ours) → 230M

ASR 500 hours x 22 - 1000 h → 10000 h

TTS 40 hours x 22 - 700 h → 800 h

OCR 100 x 22 x 1000 images - ? → 2M

NLU 330 x 22 x 1000 annotations - ? - 60M
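
As a rough cross-check, the products written on the five lines above can be multiplied out directly. The sketch below is illustrative only: the dictionary keys and the function name `scale_totals` are hypothetical, the factors are copied exactly as written, and the relationship between each product and the figures that follow the dash (for example 230M or 60M) is not stated in this doc, so the code does not try to reconcile them.

```python
from math import prod

# Factors copied verbatim from the summary lines above.
# The task labels and units are taken from those lines; the key names
# themselves are hypothetical and exist only for this sketch.
SCALE_FACTORS = {
    "Translation (bitext pairs)": (100, 11, 21, 1000),
    "ASR (hours)": (500, 22),
    "TTS (hours)": (40, 22),
    "OCR (images)": (100, 22, 1000),
    "NLU (annotations)": (330, 22, 1000),
}


def scale_totals(factors=SCALE_FACTORS):
    """Multiply out each line's factors and return {task: total}."""
    return {task: prod(values) for task, values in factors.items()}


if __name__ == "__main__":
    for task, total in scale_totals().items():
        # Thousands separators make the magnitudes easier to compare.
        print(f"{task}: {total:,}")
```

Running the script prints one total per task, which can then be compared by hand against the targets listed on each line.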
