Matthew Kastor: TTS Podcast is Easy

Want a robot to read your blog? Want to embed an mp3 player in your blog so your fans and friends can listen to it? These instructions will show you how.
TTS Podcast is Easy.mp3

Stats

Difficulty: Easy
Setup Time: 15 minutes
Generating the audio file and embedding it: 3 minutes

Overview

Recently, I decided that I wanted to generate a podcast for one of my other blogs. I did a quick Google search and found a few sites that claimed they would generate a podcast from my blog's RSS feed. Sounds good, right? Right. The thing is, none of them worked. Their RSS readers kept telling me that my feed was invalid. Maybe it was because I've got the hammer and sickle symbol (☭) scattered everywhere and their feed readers choked? I don't know. At any rate, most of them were using the same voices I had heard in open source TTS programs before, so I figured I'd just do it myself.

Quickly, the process goes like this:

Run your text through eSpeak and generate a WAV file.
Convert the WAV file to mp3 using Audacity.
Upload the mp3 file to some hosting site.
Embed an mp3 player into your blog post and load the recording into it.

Easy peasy.

Step By Step Instructions

eSpeak

First things first. You'll need to download eSpeak from http://espeak.sourceforge.net/ and install it. Installation is straightforward. The only thing you might want to do (on Windows) is enable extra voices. When you get to the screen that asks if you want to enable extra voices, paste the following in:

en en-us en-us+m1 en-us+m2 en-us+m3 en-us+m4 en-us+m5 en-us+m6 en-us+m7 en-us+f1 en-us+f2 en-us+f3 en-us+f4 en-us+f5 en-us+croak en-us+whisper en-us+klatt en-us+klatt2 en-us+klatt3 en+m1 en+m2 en+m3 en+m4 en+m5 en+m6 en+m7 en+f1 en+f2 en+f3 en+f4 en+f5 en+croak en+whisper en+whisperf en+klatt en+klatt2 en+klatt3 en+klatt4

This will give you both English voices with all the variants. Per the notes in readme.txt, which isn't available until after you install eSpeak:

The available Voices can be seen as files in the directory
  espeak-edit/voices.

To change which eSpeak Voices are available through
Windows, re-run the installer and specify the Voice files
which you want to use.

The tone of a Voice can be modified by adding a variant
name after the Voice name, eg:
  pt+f3

The available variants are:
Male:    +m1  +m2  +m3  +m4  +m5  +m6  +m7
Female:  +f1  +f2  +f3  +f4  +f5
Other effects:  +croak  +whisper
A different synthesizer method: klatt  klatt2  klatt3

These variants are defined by text files in
  espeak-edit/voices/!v

Once eSpeak is installed, you'll have a new program in your start menu called TTSApp that you can use to have the robot read things. It's easy to use: just paste text in the box and push the speak button to hear it. When you've picked the voice you want and added punctuation to make it sound right, click the save to .wav button to generate an audio file.

Alternatively, you can use the command line interface if you're into that. You'll find it in your program files folder under eSpeak/command_line. Run eSpeak -h from there to get a list of options. If you really want to tweak the way your text is read, then I recommend using the command line, as it offers you many options. As a bonus, after you get the command with all its options the way you want, you can save it in a batch file that takes two file names as arguments: one for the text file to read and the other for the audio file to generate. This is getting into advanced territory though; you don't have to make things so hard.

Audacity

Next, you'll need to install Audacity to convert the audio file from WAV to mp3. Download Audacity from http://audacity.sourceforge.net/ and install it. Installation is straightforward; there isn't anything special to do. Once Audacity is installed you'll have a new program called ... Audacity, in your start menu. Run it. Converting a WAV file to mp3 is easy: just open the WAV file in Audacity (file->open) and then export it as mp3 (file->export). The first time you try to export an mp3, Audacity will ask you to install the LAME encoder. Just follow the directions given and then try to export again. If you've successfully installed the encoder then you'll get an mp3 file. Don't worry, the LAME encoder is another easy to install, no setup kind of program.

mp3 hosting

We're going to get a link to use in the mp3 player. If you've got a hosting account for a website, you can upload your mp3 to some publicly accessible space there and just copy the web address to the file.

If you don't have your own website and you're a Google+ user, you have access to hosting from Google Drive. Just upload the mp3 to your Google Drive's public folder and use https://googledrive.com/host/<folder id>/<file name>, where <folder id> and <file name> are to be replaced with the folder id and file name from your Google Drive. Knowing the file name of your file is easy, you named it. The folder id is a little trickier, but it's easy to get as well. Just go into Drive, right click the folder that your mp3 file is in, and click share. Once the sharing window is open you'll see a link for sharing the folder. Copy the link into notepad and you'll see that there is a part of the link that says id=<a bunch of numbers, underscores, and dashes>&usp=sharing. Everything between id= and the very next & sign is the folder id.

If you're not a Google+ user you should sign up. They've got all kinds of awesome things you can do and use.
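If picking the folder id out by eye gets tedious, the same cut-and-paste can be sketched in a few lines of JavaScript. This is only an illustration: the function names extractFolderId and hostedUrl are hypothetical, and the sharing link in the example is a made-up placeholder, not a real folder. It simply takes everything between id= and the next & sign, then fills in the https://googledrive.com/host/<folder id>/<file name> pattern described above.

```javascript
// Pull the folder id out of a sharing link: everything after "id="
// up to the next "&" (or the end of the link).
function extractFolderId(shareLink) {
  const match = /[?&]id=([^&]+)/.exec(shareLink);
  return match ? match[1] : null;
}

// Fill in the hosting address pattern from the post.
// Spaces and other special characters in the file name get URL-encoded.
function hostedUrl(folderId, fileName) {
  return "https://googledrive.com/host/" + folderId + "/" + encodeURIComponent(fileName);
}

// Made-up placeholder link, for illustration only.
const exampleLink = "https://drive.google.com/folderview?id=0B_example-id&usp=sharing";
const exampleUrl = hostedUrl(extractFolderId(exampleLink), "TTS Podcast is Easy.mp3");
console.log(exampleUrl);
```

The result is the address you would paste into your blog post as the link to the recording.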

Embedding the player

HTML5 introduces the audio tag for playing audio files. While the file formats supported by different browsers vary, and handling errors isn't straightforward, it is likely your best bet. As time goes on this will be the best supported method, since it is a standard being implemented in every browser worth using at all.

I have written a small JavaScript program that will search through a webpage for links to mp3, ogg, and wav files, then inject an audio player with error handling and all that fancy business for you. All you have to do is include my JavaScript in your page and call its function after the page has loaded. It's easy:

Download the JavaScript from https://github.com/matthewkastor/audio-tag-injector/archive/master.zip
Unzip the downloaded file and include the code from audio-tag-injector.js, located in the src folder, in your webpage.
Call the function audioTagInjector either from the body tag's onload property, or insert a script just before the closing tag for the body to do so.
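To give a feel for what a script like this does, here is a heavily simplified sketch of the general idea. It is NOT the code from audio-tag-injector.js, and the names isAudioLink and injectPlayers are invented for this illustration. It looks for links ending in .mp3, .ogg, or .wav, puts an audio element with controls right after each one, and removes the player again if the browser reports an error loading the file.

```javascript
// Simplified illustration only; NOT the code from audio-tag-injector.js.

// Does this link point at an mp3, ogg, or wav file?
function isAudioLink(href) {
  // Ignore any query string or fragment, then check the extension.
  const path = String(href).split(/[?#]/)[0];
  return /\.(mp3|ogg|wav)$/i.test(path);
}

// Put an audio player after every audio link in the page.
function injectPlayers(doc) {
  const links = Array.from(doc.querySelectorAll("a[href]"));
  links.filter((a) => isAudioLink(a.href)).forEach((a) => {
    const player = doc.createElement("audio");
    player.controls = true;
    player.src = a.href;
    // If the browser cannot load the file, drop the player and keep the plain link.
    player.addEventListener("error", () => player.remove());
    a.parentNode.insertBefore(player, a.nextSibling);
  });
}

// Only hook into the page when running in a browser.
if (typeof document !== "undefined") {
  window.addEventListener("load", () => injectPlayers(document));
}
```

The real script does more than this, so use the download above for your blog and treat this as a reading aid.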

The example.html file included in the download shows you exactly how to do it in a single web page. If you would like to add it to your blogger blog globally, so that any time you include a link to an audio file it will inject a player automatically, you'll have to edit your template. It is also easy:

Get into the template editor; you'll want to edit the raw HTML of your template. Go into your blogger dashboard at http://blogger.com, select the blog you want to edit, click "template" on the left side of the screen, then find the button to "Edit HTML".
In the editor, pressing ctrl+f will bring up a search box. Search for /body to find the closing tag of the body element.
Just before the closing tag for the body, create a script element and paste the code from audio-tag-injector.js into it. Then add one additional line to the end of the code: audioTagInjector();
Save the template.

It looks like:

<script>

// paste the code from audio-tag-injector.js here

audioTagInjector();
</script>

</body>

After you've added the code, check any of your blog posts with links to audio files in them. If you've done everything right, you'll have an audio player in your blog that plays a recording of your awesome words in a robot voice!!

Have fun. :D

Matthew Kastor
