Batch processing files with Audacity

In my last post Using Text to Speech online tools to create audio files for Arduino projects I had some info about online text to audio sites I have been using to create audio alert files for projects. The files produced by ttsmp3.com seemed to have a lower audio level than I expected. I found I could run all the files through a batch process in Audacity to normalize them. I spent a lot more time trying to find out how to do it so I’m making this quick post so I have the info for next time.

I was running Audacity 2.1.3 which is now an old version. The terminology and location of the feature changed in 2.3.0. Essentially the pre 2.3 version seems to call it ‘chains’ whereas it became ‘macros’ in 2.3. I’ll include info for both as I made notes for the older versions first.

The relevant info is available on the Audacitiy site.

There are two parts to the process

  1. Create a chain/macro with the processes that will be run during the batch process. This only needs to be set up once.
  2. Select files and run the task.

There are a lot of other changes that can be included. I’ve also added Mono to one as I found some of my files created another way were stereo and others mono. My project only had a single speaker so I converted them all to mono for consistancy.

Pre 2.3 version

In these versions the batch process is referred to as a chain.

To create a chain

  • Go to File
  • Select Edit chains…
  • Add button
  • Set a name, e.g. Normalize to -0.1dB
  • Insert Normalize – Tweak settings if desired. I didn’t
  • Insert ExportMP3 – This is required otherwise the files are not saved
The chain should look something like this

To use the chain

  • Go to File
  • Select Apply chains…
  • Select the chain, in my case ‘Normalize to -0.1dB’
  • Select Apply to files
  • Browse to the file folder and choose files
  • Select Open
  • Converted files will be placed in a folder called cleaned inside the folder of the selected files

Current version

In later versions it is called a macro.

To create a macro

  • Go to Tools
  • Select Macros…
  • New button
  • Set a name, e.g. Normalize
  • Insert Normalize. There is also an option ‘Normalize (Macro_Normalize)’ that I have not tried

  • Insert Export as MP3 – This is required otherwise the files are not saved

To use the macro

  • Go to Tools
  • Select Macros…
  • Select the macro, in my case Normalize
  • Select Files
  • Browse to the file folder and choose files
  • Select Open
  • Converted files will be placed in a folder called macro-output inside the folder of the selected files

Before and after

Here is a before and after shot. The before is the one at the top.

As always if let me know if you find inaccuracies in my info.

Using Text to Speech online tools to create audio files for Arduino projects

I’ve made a couple of projects that play audio alerts using a dfplayer and MP3 files. One in a clock that plays announcements and the other a countdown timer. When I did those, I used onlinetonegenerator.com to convert text to speech. I liked the voices but it doesn’t have an option to save the audio as an MP3 file. I ended up using Audacity to record the computers audio. It worked but was very tedious. Since then, I’ve been looking for a simpler way.

The important criteria in a text to speech service for me is:

  • Ability to enter text, listen to the converted audio in the browser and then download it as an mp3 file.
  • Sufficient audio quality.
  • Volume level ok.
  • Suitable lead in and end dead time to allow multiple files to be played in sequence with the  result sounding as a smooth sentence. For example these four files together; “The time is”, “eleven”, “thirty two”, “am”.
  • A suitable voice. They don’t all have the same voices. I prefer some more than others.
  • Free or good value.
  • Ability to change speed, pitch and emphasis a bonus.

Here are a few I’ve looked at.

Online Tone Generator (onlinetonegenerator.com/voice-generator.html)

This is the first one I used. Its:

  • Free.
  • Has a good range of voices; mostly Google, plus a few Microsoft ones I presume are built into my Windows PC. I particularly like Google français.
  • Lots of other audio tools, including; noise generator, sweep generator, DTMF signals.
  • But, no MP3 download.

From Text To Speech (fromtexttospeech.com)

I briefly looked at this one. It is:

  • Free.
  • Has a 50 000 character limit per file.
  • I’m unable to listen in browser. No play button displays until a file is created and then the player presented appears to use flash and is blocked by my browser. The file can be downloaded and used.
  • MP3 download.
  • Different speed and voices.

TTSMP3 (ttsmp3.com)

This is the one that I am intending to use in my next project. Lots of features, MP3 file output and a fair amount of free usage. It is:

  • Free for 3,000 characters (~375 words) per day.
  • Lots of different voices.
  • Supports speed, pitch and other effects using tags.
  • Multiple voices can be used in the one piece of text by using tags.
  • MP3 file download.

TTSMP3 uses Amazon Polly and comes with quite a few voices and features. Additional effects can be used by using tags in your text. More info about tags is available on this Amazon page.

Here is an example of the voices.

That audio file was created by pasting the text below into the converter. Beware if you do this it will use up most of your daily 3000 word limit.

[speaker:Zeina] Hi, I'm Arabic Zeina
[speaker:Russell] Hi, I'm Russell Australian English Russell
[speaker:Nicole] Hi, I'm Australian English Nicole
[speaker:Camila] Hi, I'm Brazilian Portuguese Camila
[speaker:Ricardo] Hi, I'm Brazilian Portuguese Ricardo
[speaker:Vitória] Hi, I'm Brazilian Portuguese Vitória
[speaker:Emma] Hi, I'm British English Emma
[speaker:Amy] Hi, I'm British English Amy
[speaker:Brian] Hi, I'm British English Brian
[speaker:Chantal] Hi, I'm Canadian French Chantal
[speaker:Enrique] Hi, I'm Castilian Spanish Enrique
[speaker:Lucia] Hi, I'm Castilian Spanish Lucia
[speaker:Conchita] Hi, I'm Castilian Spanish Conchita
[speaker:Zhiyu] Hi, I'm Chinese Mandarin Zhiyu
[speaker:Mads] Hi, I'm Danish Mads
[speaker:Naja] Hi, I'm Danish Naja
[speaker:Ruben] Hi, I'm Dutch Ruben
[speaker:Lotte] Hi, I'm Dutch Lotte
[speaker:Céline] Hi, I'm French Céline
[speaker:Léa] Hi, I'm French Léa
[speaker:Mathieu] Hi, I'm French Mathieu
[speaker:Vicki] Hi, I'm German Vicki
[speaker:Marlene] Hi, I'm German Marlene
[speaker:Hans] Hi, I'm German Hans
[speaker:Karl] Hi, I'm Icelandic Karl
[speaker:Dóra] Hi, I'm Icelandic Dóra
[speaker:Aditi] Hi, I'm Indian English Aditi
[speaker:Raveena] Hi, I'm Indian English Raveena
[speaker:Carla] Hi, I'm Italian Carla
[speaker:Giorgio] Hi, I'm Italian Giorgio
[speaker:Bianca] Hi, I'm Italian Bianca
[speaker:Takumi] Hi, I'm Japanese Takumi
[speaker:Mizuki] Hi, I'm Japanese Mizuki
[speaker:Seoyeon] Hi, I'm Korean Seoyeon
[speaker:Mia] Hi, I'm Mexican Spanish Mia
[speaker:Liv] Hi, I'm Norwegian Liv
[speaker:Ewa] Hi, I'm Polish Ewa
[speaker:Jan] Hi, I'm Polish Jan
[speaker:Maja] Hi, I'm Polish Maja
[speaker:Jacek] Hi, I'm Polish Jacek
[speaker:Inês] Hi, I'm Portuguese Inês
[speaker:Cristiano] Hi, I'm Portuguese Cristiano
[speaker:Carmen] Hi, I'm Romanian Carmen
[speaker:Maxim] Hi, I'm Russian Maxim
[speaker:Tatyana] Hi, I'm Russian Tatyana
[speaker:Astrid] Hi, I'm Swedish Astrid
[speaker:Filiz] Hi, I'm Turkish Filiz
[speaker:Joey] Hi, I'm US English Joey
[speaker:Kimberly] Hi, I'm US English Kimberly
[speaker:Salli] Hi, I'm US English Salli
[speaker:Ivy] Hi, I'm US English Ivy
[speaker:Matthew] Hi, I'm US English Matthew
[speaker:Kendra] Hi, I'm US English Kendra
[speaker:Joanna] Hi, I'm US English Joanna
[speaker:Justin] Hi, I'm US English Justin
[speaker:Miguel] Hi, I'm US Spanish Miguel
[speaker:Lupe] Hi, I'm US Spanish Lupe
[speaker:Penélope] Hi, I'm US Spanish Penélope
[speaker:Gwyneth] Hi, I'm Welsh Gwyneth
[speaker:Geraint] Hi, I'm Welsh English Geraint

Comparisons

Compared with the original audio files that I created by using Audacity to record the PC audio and onlinetonegenerator.com, ttsmp3.com had lower volume. I may have had the record level a bit high when I used Audacity so not sure that the ttsmp3 level is too low.

The bit rate is also different, with the Audacity ones higher. That’s probably because I unnecessarily chose a higher bitrate in Audacity. TTSMP3 was 48kbs.

And that affected the file size. The TTSMP3 is much smaller.

Here are a couple of examples for comparison. For each I created four separate files and then joined them together to see how smooth the transition was. The four files were “The time is”, “11”, “32”, “AM”. I had to be a bit creative with the AM for French Celine as it was pronounced as “am”.

onlinetonegenerator.com Voice is Google français. I like this voice. It has added a lot of character to my speaking clock.

ttsmp3.com Voice is French Celine. It was much easier to create and the timing between files is ok, but the voice doesn’t have the same character to the one above in my opinion

ttsmp3.com This is British Amy. This was just for comparison to see how the same text would sound with an English voice.

Do you have other methods? Let me know.

Update 13/4/2021: I have added a post with info about how to increase the audio level in a batch of files Batch processing files with Audacity