
Audio Splitter

Using Videopipe to split your audio into speech and music.

🧰 What is this feature?

The audio splitting model takes a video file (.mp4), extracts its audio, and separates that audio into a speech track and a music track. You get two audio files back, one for each source.

🔧 How to use it?

1️⃣ Export audio: In Avid Media Composer, send your video using one of the following ‘send to playback’ profiles. Note the filename of the input, because you will need it to find the results.

RTL Nieuws Online Editorial Team: NIEUWS_AZURE_DI_SPLITAUDIO

Promo Editors: PROMO_AZURE_DI_SPLITAUDIO

Please make sure that there are no spaces in the filename. An example of a correct filename is audio_split_doc.mp4. Filenames like audio split doc.mp4 will not work.
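If you want to check a filename before sending it, the rule above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and it checks nothing beyond the .mp4 extension and the no-spaces rule mentioned in this doc.

```python
import re


def is_valid_input_filename(filename: str) -> bool:
    """Return True if the name ends in .mp4 and contains no whitespace.

    Assumption: these are the only two rules checked; the pipeline may
    enforce other constraints that this doc does not mention.
    """
    if not filename.lower().endswith(".mp4"):
        return False
    return re.search(r"\s", filename) is None


def suggest_filename(filename: str) -> str:
    """Replace each run of whitespace with a single underscore."""
    return re.sub(r"\s+", "_", filename.strip())


if __name__ == "__main__":
    for name in ["audio_split_doc.mp4", "audio split doc.mp4"]:
        if is_valid_input_filename(name):
            print(f"{name}: ok")
        else:
            print(f"{name}: not ok, try {suggest_filename(name)}")
```

Replacing spaces with underscores is just one possible fix; any name without spaces should do.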


2️⃣ Wait for processing: Grab a ☕ and wait 8 to 20 minutes for the audio to be processed. Longer input files will take longer to process.


3️⃣ Download audio files: You will find the separated audio files in the folder for your team, listed below. In each path, filename is the filename of the video you sent in step 1.

RTL Nieuws Online Editorial Team:

Separated speech: \\rtl-isilon\AVID_DROPBOX\DROPBOX\65_AUTOMATED_CONTENT\SPLITAUDIO\filename_vocals.wav

Separated music: \\rtl-isilon\AVID_DROPBOX\DROPBOX\65_AUTOMATED_CONTENT\SPLITAUDIO\filename_music.wav

Promo Editors:

Separated speech: \\rtl-isilon.pp.local\avid_dropbox\dropbox\promo\automated_content\splitaudio_promo_di\filename_vocals.wav

Separated music: \\rtl-isilon.pp.local\avid_dropbox\dropbox\promo\automated_content\splitaudio_promo_di\filename_music.wav
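The naming pattern above can be sketched as a small Python helper that builds the two expected paths for a given input. The folder paths are copied from the list above. The team keys ("nieuws", "promo") and the function name are hypothetical labels for this example. It also assumes that the .mp4 extension is dropped before _vocals.wav or _music.wav is appended, which this doc does not state explicitly.

```python
from pathlib import PureWindowsPath

# Output folders copied from this doc; the dictionary keys are hypothetical labels.
OUTPUT_FOLDERS = {
    "nieuws": r"\\rtl-isilon\AVID_DROPBOX\DROPBOX\65_AUTOMATED_CONTENT\SPLITAUDIO",
    "promo": r"\\rtl-isilon.pp.local\avid_dropbox\dropbox\promo\automated_content\splitaudio_promo_di",
}


def expected_outputs(input_filename: str, team: str) -> dict:
    """Return the expected speech and music paths for one input file.

    Assumption: the output name uses the input name without its extension.
    """
    stem = PureWindowsPath(input_filename).stem
    folder = PureWindowsPath(OUTPUT_FOLDERS[team])
    return {
        "speech": str(folder / f"{stem}_vocals.wav"),
        "music": str(folder / f"{stem}_music.wav"),
    }


if __name__ == "__main__":
    for kind, path in expected_outputs("audio_split_doc.mp4", "promo").items():
        print(f"{kind}: {path}")
```

PureWindowsPath only builds the path strings and never touches the network share, so the sketch runs on any machine.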


❌ Did not receive the files?

Things can go wrong; even machines make mistakes. Did you wait 20 minutes and are the audio files still not in your dropbox location? Here’s what you can do:

✉️ Contact Data Science: Ask in the Slack channel #videopipe-for-editors what the status of your audio is. Make sure to mention the filename in your message so we know what to look for. A data scientist will help you out as soon as possible!

🤖 A little note of appreciation for the model:

The model behind the audio splitter is Demucs (Deep Extractor for Music sources) by Facebook/Meta.

The model starts by identifying different patterns in the waveform. Once the basic patterns are laid out, it builds a higher-level structure around them to work out which pattern belongs to which instrument or to speech. With this information, the model can then carefully separate out these components. You can read more about the model here.
