IndicASR

Multilingual ASR

Things to do

- Accuracy
  - Augmentation of fine-tuning data <- read papers (see the speed-perturbation sketch after this list)
  - Proper-noun hack <- talk to Harveen
  - Adding hot words / domain specialization <- read papers, engineering effort
  - Evaluating multilingual model options
  - Evaluating model size, batch size, ...
- Latency
  - Time spent in the AM vs. the LM <- measure (see the timing sketch after this list)
  - Reduce LM time with a smaller LM <- compare accuracy and latency
- Evaluating all preprocessing steps, then applying them to both the fine-tuning and pre-training data
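
For the augmentation item, which is still at the read-papers stage, a minimal sketch of one common option: Kaldi-style speed perturbation at 0.9x / 1.0x / 1.1x. The factors and the scipy-based resampling are assumptions for illustration, not an approach this project has settled on.

```python
# Speed-perturbation sketch (0.9x / 1.0x / 1.1x, as in common Kaldi-style recipes).
# Purely illustrative; factors and resampling method are assumptions.
import numpy as np
from scipy.signal import resample_poly

def speed_perturb(waveform: np.ndarray, factor: float) -> np.ndarray:
    """Return `waveform` played at `factor` times the original speed
    (0.9 = slower/longer, 1.1 = faster/shorter)."""
    # Resampling by 1/factor and playing back at the original sample rate
    # shifts both tempo and pitch, which is the standard speed perturbation.
    return resample_poly(waveform, up=100, down=int(round(100 * factor)))

rng = np.random.default_rng(0)
utterance = rng.standard_normal(16000).astype(np.float32)  # stand-in for 1 s at 16 kHz
augmented = {f: speed_perturb(utterance, f) for f in (0.9, 1.0, 1.1)}
print({f: len(x) for f, x in augmented.items()})           # slower copies come out longer
```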
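For the AM-vs-LM timing item, a minimal measurement sketch, assuming the stack is a wav2vec 2.0-style CTC acoustic model loaded through HuggingFace transformers plus a KenLM-backed beam-search decoder from pyctcdecode; the checkpoint path, LM path, and audio file below are placeholders, not the project's actual artifacts. The same decode call is also where hot words could later be boosted (pyctcdecode exposes hotwords / hotword_weight arguments).

```python
# Timing sketch: split per-utterance decode time into the acoustic-model (AM)
# forward pass vs. the LM-fused beam search. Paths are placeholders.
import time

import soundfile as sf
import torch
from pyctcdecode import build_ctcdecoder
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

CHECKPOINT = "path/to/finetuned-checkpoint"  # placeholder
LM_PATH = "path/to/lm.arpa"                  # placeholder KenLM file

processor = Wav2Vec2Processor.from_pretrained(CHECKPOINT)
model = Wav2Vec2ForCTC.from_pretrained(CHECKPOINT).eval()

# Vocabulary in index order: the decoder expects one label per CTC output unit.
vocab = processor.tokenizer.get_vocab()
labels = [tok for tok, _ in sorted(vocab.items(), key=lambda kv: kv[1])]
decoder = build_ctcdecoder(labels, kenlm_model_path=LM_PATH)

audio, sr = sf.read("utterance.wav")         # placeholder: 16 kHz mono audio
inputs = processor(audio, sampling_rate=sr, return_tensors="pt")

t0 = time.perf_counter()
with torch.no_grad():
    logits = model(inputs.input_values).logits[0]   # AM forward pass
t1 = time.perf_counter()
hyp = decoder.decode(logits.cpu().numpy())          # beam search + KenLM
# Hot words could be boosted in the same call, e.g.
# decoder.decode(..., hotwords=["AI4Bharat"], hotword_weight=10.0)
t2 = time.perf_counter()

print(f"AM: {t1 - t0:.3f}s  LM/beam search: {t2 - t1:.3f}s  hyp: {hyp}")
```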

Multilingual Finetuning results
All figures are WER / CER in percent, per test set (rows) and fine-tuned model (columns). Cells marked "-" have no value in this capture.

| Language | Test set | base_existing_test4 | large_existing | base_bs2x_existing | base_steps2x_existing |
| --- | --- | --- | --- | --- | --- |
| odia | dcunk_new | 72 / 20.04 | 57.34 / 13.52 | 78.28 / 23.39 | 81.94 / 25.65 |
| odia | dckn_new | 71.1 / 19.28 | 57.08 / 13.17 | 78.17 / 22.72 | 81.77 / 24.8 |
| odia | mucs | 33.73 / 7.48 | 28.77 / 4.64 | 37.81 / 9.95 | 39.23 / 10.96 |
| bengali | dcunk_new | 47.38 / 15.63 | 35.55 / 12.18 | 46.29 / 15.52 | 47.08 / 16.12 |
| bengali | dckn_new | 49.76 / 16.23 | 37.83 / 12.99 | 48.07 / 15.67 | 50.47 / 16.69 |
| bengali | openslr | 30.99 / 11.27 | 25.85 / 9.79 | 27.65 / 10.29 | 27.79 / 10.28 |
| telugu | dcunk_new | 44.33 / 10.02 | 37.61 / 7.87 | 40.31 / 8.63 | 40.84 / 8.98 |
| telugu | dckn_new | 43.6 / 9.85 | 36.15 / 7.3 | 39.74 / 8.63 | 41.5 / 9.75 |
| telugu | mucs | 40.46 / 9.92 | 35.8 / 8.35 | 37.36 / 9.08 | 37.77 / 9.1 |
| telugu | msr | 33.19 / 7.21 | 29.2 / 6.14 | 30.21 / 6.55 | 30.35 / 6.52 |
| gujarati | dcunk_new | 26.16 / 6.96 | 21.86 / 5.75 | 24.38 / 6.7 | 26.87 / 7.43 |
| gujarati | dckn_new | 33.21 / 9.38 | 26.2 / 7.08 | 31.67 / 9.18 | 33.87 / 9.97 |
| gujarati | mucs | 34.29 / 12.02 | 29.59 / 10.08 | 32.35 / 11.24 | 32.88 / 11.43 |
| gujarati | msr | 27.49 / 8.75 | 23.63 / 7.38 | 25.65 / 8.14 | 25.76 / 8.17 |
| hindi | dcunk_new | 30.63 / 9.63 | 23.86 / 7.05 | 29.86 / 9.61 | 28.76 / 9.49 |
| hindi | dckn_new | 27.14 / 8.87 | 20.05 / 6.17 | 26.95 / 8.98 | 27.79 / 9.31 |
| hindi | mucs | 18.67 / 6.39 | 16.38 / 5.25 | 19.34 / 6.66 | 19.79 / 6.82 |
| marathi | dcunk_new | 74.03 / 21.93 | 58.69 / 15.08 | 74.62 / 22.83 | 74.44 / 23.73 |
| marathi | dckn_new | 69.46 / 18.84 | 56.32 / 13.29 | 71.18 / 19.49 | 71.05 / 20.45 |
| marathi | mucs | 29.57 / 7.42 | 20.74 / 4.26 | 36.59 / 9.33 | 38.53 / 10.92 |
| tamil | dcunk_new | 36.56 / 6.22 | 33.13 / 5.37 | 34.86 / 5.93 | 34.6 / 5.88 |
| tamil | dckn_new | 39.86 / 7.32 | 34.38 / 5.91 | 36.79 / 6.72 | - |
| tamil | mucs | 35.64 / 7.88 | 32.5 / 6.93 | 33.36 / 7.31 | - |
| tamil | msr | 29.42 / 5.85 | 26.99 / 5.25 | 27.25 / 5.32 | - |
| tamil_32_2_-1 | dcunk_new | 22.5 / 4.05 | 21.43 / 3.62 | - | - |
| tamil_32_2_-1 | dckn_new | 23.11 / 4.96 | 21.42 / 3.93 | - | - |
| tamil_32_2_-1 | mucs | 24.61 / 5.35 | 23.59 / 4.94 | - | - |
| tamil_32_2_-1 | msr | 22.55 / 4.69 | 21.91 / 4.45 | - | - |
| odia_32_2_-1 | dcunk_new | 42.68 / 16.89 | 32.46 / 9.56 | - | - |
| odia_32_2_-1 | dckn_new | 41.21 / 15.73 | 32.74 / 9.25 | - | - |
| odia_32_2_-1 | mucs | 27.84 / 6.8 | 24.25 / 4.19 | - | - |
| telugu_32_2_-1 | dcunk_new | 25.41 / 8.19 | 22.33 / 5.99 | - | - |
| telugu_32_2_-1 | dckn_new | 25.11 / 8.36 | 21 / 5.44 | - | - |
| telugu_32_2_-1 | mucs | 24.27 / 6.32 | 22.63 / 5.59 | - | - |
| telugu_32_2_-1 | msr | 22.61 / 5.54 | 21.28 / 4.97 | - | - |
| bengali_32_2_-1 | dcunk_new | 29.19 / 11.34 | 23.5 / 7.83 | - | - |
| bengali_32_2_-1 | dckn_new | 30.81 / 11.96 | 25.17 / 8.43 | - | - |
| bengali_32_2_-1 | openslr | 12.44 / 3.93 | 10.95 / 3.34 | - | - |
| marathi_32_2_-1 | dcunk_new | 43.28 / 21.68 | 31.06 / 12.74 | - | - |
| marathi_32_2_-1 | dckn_new | 37.92 / 17.26 | 28.6 / 10.41 | - | - |
| marathi_32_2_-1 | mucs | 17.69 / 6.6 | 13 / 3.88 | - | - |
| gujarati_32_2_-1 | dcunk_new | 13.82 / 4.95 | 12.05 / 3.91 | - | - |
| gujarati_32_2_-1 | dckn_new | 17.95 / 7.67 | 14.22 / 5 | - | - |
| gujarati_32_2_-1 | mucs | 21.38 / 9.16 | 19.02 / 7.6 | - | - |
| gujarati_32_2_-1 | msr | 18.87 / 7.49 | 17.58 / 6.72 | - | - |
| hindi_32_2_-1 | dcunk_new | 12.06 / 5.61 | 9.25 / 3.83 | - | - |
| hindi_32_2_-1 | dckn_new | 11.54 / 5.31 | 8.58 / 3.37 | - | - |
| hindi_32_2_-1 | mucs | 14.33 / 5.56 | 12.97 / 4.56 | - | - |
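
For reference, the WER and CER figures above are the usual edit-distance metrics: Levenshtein distance over words (WER) or characters (CER), divided by the reference length and reported as a percentage. A self-contained sketch of that computation; the example strings are illustrative only, not drawn from any of the test sets above.

```python
# WER / CER sketch: Levenshtein distance over words or characters,
# normalized by reference length, reported in percent.

def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or characters)."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, start=1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,        # skip a reference token
                        dp[j - 1] + 1,    # skip a hypothesis token
                        prev + (r != h))  # substitute (or match)
            prev = cur
    return dp[-1]

def wer(ref: str, hyp: str) -> float:
    ref_words, hyp_words = ref.split(), hyp.split()
    return 100.0 * edit_distance(ref_words, hyp_words) / len(ref_words)

def cer(ref: str, hyp: str) -> float:
    ref_chars, hyp_chars = list(ref.replace(" ", "")), list(hyp.replace(" ", ""))
    return 100.0 * edit_distance(ref_chars, hyp_chars) / len(ref_chars)

if __name__ == "__main__":
    reference = "the quick brown fox"
    hypothesis = "the quick brown box"
    print(f"WER = {wer(reference, hypothesis):.2f}%, CER = {cer(reference, hypothesis):.2f}%")
```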