It’s been a while since I’ve played with the open source Festival TTS software, and I’m pretty impressed with the quality of the speech output. Some of the available voices sound so much better than the old diphone-based voices that evoke WOPR from WarGames.
This got me thinking it’d be fun to integrate some of this functionality into a web application. A quick search turned up Tony Bhimani’s Linux Text-To-Speech Tutorial, which includes a sample PHP application that uses Festival’s text2wave utility and the LAME MP3 encoder to produce MP3 files from user-submitted text.
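To give a feel for that pipeline, here’s a rough sketch in Python rather than PHP. It isn’t Tony’s code, just my guess at the general shape: write the text to a file, run text2wave on it, then hand the WAV to lame. The function names (`build_commands`, `text_to_mp3`) and the temp file names are made up for illustration, and it assumes both `text2wave` and `lame` are installed and on your PATH.

```python
import subprocess
import tempfile
from pathlib import Path


def build_commands(text_path, wav_path, mp3_path):
    """Return the two command lines: Festival's text2wave, then lame.

    Hypothetical helper; kept separate so the commands can be inspected
    without actually running anything.
    """
    return [
        ["text2wave", str(text_path), "-o", str(wav_path)],
        ["lame", str(wav_path), str(mp3_path)],
    ]


def text_to_mp3(text, mp3_path):
    """Synthesize `text` into an MP3 at `mp3_path`.

    Assumes text2wave and lame are on PATH; raises CalledProcessError
    if either tool exits with an error.
    """
    with tempfile.TemporaryDirectory() as tmp:
        text_path = Path(tmp) / "input.txt"
        wav_path = Path(tmp) / "speech.wav"
        # The text goes into a file, never onto the command line.
        text_path.write_text(text, encoding="utf-8")
        for cmd in build_commands(text_path, wav_path, mp3_path):
            subprocess.run(cmd, check=True)
    return Path(mp3_path)
```

Calling `text_to_mp3("Shall we play a game?", "game.mp3")` should leave you with a playable file. Since the text comes from users, I pass the commands as argument lists instead of shell strings, so nothing a visitor types gets interpreted by the shell.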
I mentioned that some of the voices are pretty outstanding. In particular, the “unit selection” voices, demonstrated on the Festival demo page, can synthesize a lot of sentences with few noticeable glitches. These voices sound so nice because they contain a much larger database of common sound units, falling back on heavily processed output only for less common utterances. There’s a howto and discussion over on Ubuntu Forums that’ll guide you through installing and using these enhanced voices with Festival. With a decent voice file, Festival, and an adaptation of Tony’s PHP text-to-speech demonstration, it wouldn’t be too hard to add audio output to your blog or create a script that turns your RSS feeds into a podcast for the daily commute.
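For the RSS idea, the first step would be boiling a feed down to plain text that a synthesizer can read. Here’s a minimal sketch of that step using only Python’s standard library. The `feed_to_script` name is my own invention, and it assumes a plain RSS 2.0 feed with `channel`/`item` elements; it doesn’t fetch the feed or strip any HTML that might be sitting in the descriptions.

```python
import xml.etree.ElementTree as ET


def feed_to_script(rss_xml, limit=5):
    """Turn the first `limit` RSS 2.0 items into plain text for reading aloud.

    Hypothetical helper: each item becomes "Title. Description", and items
    are separated by a blank line.
    """
    root = ET.fromstring(rss_xml)
    lines = []
    for item in root.findall("./channel/item")[:limit]:
        title = (item.findtext("title") or "").strip()
        summary = (item.findtext("description") or "").strip()
        # strip() tidies up the trailing space when there is no description.
        lines.append(f"{title}. {summary}".strip())
    return "\n\n".join(lines)
```

The resulting text could then be saved to a file and fed to Festival’s text2wave to produce the audio for each day’s “episode”.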
Have any of your own text-to-speech ideas or demos? Please share them in the comments!