
August 09, 2005

Auto-podcasting, Part 2

Comments

Jason Dowdell

It's quite possibly the worst TTS [text-to-speech] rendering I've ever heard. I understand that the TTS engine doesn't have all of the grammars the BBC reporters may be using in their stories, but the engine being used is awful. I'm wondering what TTS engine they're actually using.

Also, I researched something similar some time ago, and the main problem with using a TTS engine to render text into a podcast in an automated fashion is the licensing fees involved. If you use an open source engine, licensing isn't a problem, but the audio sucks and is barely understandable. If you go with a mainstream engine with decent voice renderings and all the grammars, you're looking at something like ScanSoft, which gets extremely pricey, and the novelty wears off quickly since the audio still isn't as good as a human and you're paying for it.

Matthew Hurst

Jason,

You should have a listen to Cepstral's engine (http://www.cepstral.com). I was impressed with it when I tried it out by creating speech from an earlier post.

Ted Gilchrist

Jason,

My first reaction is to say, "Then you haven't heard very many text-to-speech engines." The Festival text-to-speech engine has been developed by serious, respected text-to-speech researchers, and has been around for a long time. It makes a serious attempt to incorporate knowledge about language at a number of levels, from phonetics to prosody, and I think the out-of-the-box voices are more than adequate for "easy listening".

Besides, it's hard to beat the price. Personally, my primary interest at this point is to explore the possible uses of text-to-speech technology, not to fret a lot about the voice quality. I am thrilled that a solid open source offering is available to me.

I pin my hopes on the theory that if you have interesting content, then people will adjust their ample speech recognition facilities to the voices, and master them.

Trust me. After hundreds of hours of listening, you stop noticing.

Ted Gilchrist

On the other hand, I just listened again to Cepstral. Yeah, it sounds pretty good. Time to think about allocating funds for a license.

