Deep Learning Hiring Challenge

Vaibhav Saxena

Time for completion: 4 days

Hi there, thank you for your interest in the deep learning intern position at Synth.

As part of the task you need to:

Fine-tune Whisper (large-v2, medium, tiny, or base) on the following dataset (https://huggingface.co/datasets/google/fleurs/viewer/hi_in/train). Use only the training split. If you use the validation set, it will be obvious to us, because Whisper overfits very easily. You can use Google Colab for fine-tuning.

Once you have fine-tuned Whisper, deploy it and create an API endpoint that can be used to transcribe audio files in MP3 format.
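As a rough illustration of what such an endpoint could look like, here is a minimal sketch that uses only the Python standard library. The `/transcribe` path, the JSON response shape, and the `transcribe_mp3` function are assumptions made for this example and are not part of the task description. `transcribe_mp3` is a hypothetical placeholder where the fine-tuned model would be called. A hosted deployment would follow the chosen provider's own conventions instead.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def transcribe_mp3(audio_bytes: bytes) -> str:
    """Hypothetical placeholder: a real deployment would decode the MP3
    bytes and run the fine-tuned Whisper model here."""
    return f"<transcript for {len(audio_bytes)} bytes of audio>"


class TranscribeHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # Only one route is served in this sketch (the path is an assumption).
        if self.path != "/transcribe":
            self.send_error(404, "Unknown path")
            return
        length = int(self.headers.get("Content-Length", 0))
        if length == 0:
            self.send_error(400, "Empty request body")
            return
        # The request body is treated as the raw bytes of the MP3 file.
        audio_bytes = self.rfile.read(length)
        text = transcribe_mp3(audio_bytes)
        # ensure_ascii=False keeps Devanagari text readable in the response.
        body = json.dumps({"text": text}, ensure_ascii=False).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # silence per-request logging in this sketch


def start_server(port: int = 0) -> ThreadingHTTPServer:
    """Start the server in a background thread.

    Port 0 asks the operating system for any free port.
    """
    server = ThreadingHTTPServer(("127.0.0.1", port), TranscribeHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `start_server(8000)` and sending a POST request with the MP3 bytes as the body to `http://127.0.0.1:8000/transcribe` returns a JSON object with a `text` field.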

You can use services with a limited free tier, such as https://www.banana.dev/ and https://www.cerebrium.ai/, to deploy the model and create an API inference endpoint. They offer some free hours or credits.

You can augment the dataset with additional Hindi data (highly recommended).

Evaluation:

We will run your fine-tuned model on our own validation dataset, which is different from the one at the link above, to test its accuracy.

You will be evaluated on the following:

The accuracy of the inference results on our Hindi dataset

The inference speed

The techniques used to fine-tune Whisper and to pre-process the data
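The criteria above do not say how accuracy or speed will be measured. As an assumption, the sketch below uses word error rate (WER), a common metric for speech recognition, and the real-time factor (processing time divided by audio duration) as a simple speed measure. The function names are hypothetical, and the WER here splits on whitespace without any text normalization, which a real evaluation would likely add.

```python
import time


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def real_time_factor(transcribe, audio, audio_seconds: float) -> float:
    """Time one call to `transcribe` and divide by the audio duration.

    Values below 1.0 mean the audio is processed faster than real time.
    """
    start = time.perf_counter()
    transcribe(audio)
    elapsed = time.perf_counter() - start
    return elapsed / audio_seconds


# One substituted word out of four reference words.
print(word_error_rate("मेरा नाम राम है", "मेरा नाम श्याम है"))  # 0.25
```

Tracking both numbers on a held-out portion of your own data while you train gives an early signal of overfitting and of how model size affects latency.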

Deliverables:

Share a Google Colab notebook that shows your work, along with the API inference endpoint.

In the notebook, please document and explain why and how you took each step.

Send your work to vaibhav@usesynth.com. For any queries, please reach out to me at the same address.
