FAQ
Frequently Asked Questions
Below you'll find an index to our frequently asked questions list. If you have a question that you don't see on this list, feel free to Ask Us.
Speech Recognition
Text To Speech
Engines, APIs and Markups
What is speech recognition, speech reco or ASR?
How good is it, really?
Why do I have to train some speech engines?
Who has the best speech engine?
What is speech recognition over the telephone?
Because speech telephony servers answer calls from anyone, they are speaker-independent, grammar-based systems. You cannot say just anything into the phone; you must say something that the server expects. For example, if the server asks "Would you like stock information, news reports, or weather?", your response can be "stock information", "news reports", or "weather." You cannot say, for example, "Hi server, how about the weather today?"
What is speech recognition on the Web?
What is dictation-based versus grammar-based recognition?
Grammar-based recognition, on the other hand, typically does not require training to achieve high accuracy rates and has a small to medium vocabulary. Because the vocabulary is small, it is easier for the engine to determine what you've said, as long as it falls within what the grammar provides. For example, if the words "apple" and "orange" are the only words in the grammar, you can only say "apple" or "orange" and not "carrot." Typically, a grammar-based recognizer can tell you that you've said something that isn't in its vocabulary, but it cannot repeat it back to you.
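The "apple" and "orange" example can be sketched in a few lines of Python. This is only a toy model, assuming the utterance has already been turned into text; a real grammar-based engine matches audio against the grammar. The names `TinyGrammarRecognizer` and `RecognitionResult` are hypothetical and do not belong to any speech API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RecognitionResult:
    in_grammar: bool
    word: Optional[str]  # None when the utterance was out of grammar


class TinyGrammarRecognizer:
    """Toy stand-in for a grammar-based recognizer; works on text, not audio."""

    def __init__(self, vocabulary):
        # The grammar here is just a fixed set of allowed words.
        self._vocabulary = {word.lower() for word in vocabulary}

    def recognize(self, utterance: str) -> RecognitionResult:
        candidate = utterance.strip().lower()
        if candidate in self._vocabulary:
            return RecognitionResult(True, candidate)
        # Out of grammar: report the miss, but carry no text,
        # mirroring an engine that cannot repeat back what was said.
        return RecognitionResult(False, None)


if __name__ == "__main__":
    recognizer = TinyGrammarRecognizer(["apple", "orange"])
    for spoken in ("apple", "orange", "carrot"):
        print(spoken, "->", recognizer.recognize(spoken))
```

The out-of-grammar result deliberately has no `word`, which is the point of the last sentence above: the recognizer knows that it missed, not what was said.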
How can I add speech to my application?
Text To Speech
What is TTS or Text-To-Speech?
What are the benefits to using TTS?
When delivering speech over the Internet or an intranet, TTS saves bandwidth by sending only the text of what needs to be said, instead of large wave files. A wave file can consume hundreds of kilobytes, whereas even a long string of text to be spoken takes only a few. The downside to TTS is that you give up quality compared to using prerecorded audio.
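To get a feel for the size difference, here is a rough back-of-the-envelope calculation in Python. The numbers are illustrative assumptions, not measurements: telephone-quality audio (8 kHz, 8-bit, mono, uncompressed PCM), a 44-byte WAV header, and a speaking rate of about 150 words per minute.

```python
# Rough size comparison: sending text for TTS versus sending prerecorded audio.
# All numeric defaults below are illustrative assumptions, not measurements.

WAV_HEADER_BYTES = 44  # size of a canonical PCM WAV header


def text_size_bytes(text: str) -> int:
    """Bytes needed to send the text itself, encoded as UTF-8."""
    return len(text.encode("utf-8"))


def estimated_speech_seconds(text: str, words_per_minute: float = 150.0) -> float:
    """Estimate how long the text takes to speak at an assumed speaking rate."""
    word_count = len(text.split())
    return word_count / words_per_minute * 60.0


def wav_size_bytes(seconds: float, sample_rate: int = 8000,
                   bytes_per_sample: int = 1, channels: int = 1) -> int:
    """Size of an uncompressed PCM wave file of the given duration."""
    audio_bytes = round(seconds * sample_rate * bytes_per_sample * channels)
    return WAV_HEADER_BYTES + audio_bytes


if __name__ == "__main__":
    prompt = "Would you like stock information, news reports, or weather? " * 10
    seconds = estimated_speech_seconds(prompt)
    print(f"text: {text_size_bytes(prompt)} bytes")
    print(f"wave: {wav_size_bytes(seconds)} bytes for about {seconds:.0f} s of audio")
```

Higher sample rates or 16-bit samples widen the gap further, while compressed audio formats narrow it.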
What makes good TTS and what makes it bad?
All of these things factor into what you might call good TTS. Bad TTS is TTS that lacks this combination of elements.
What is concatenative TTS versus synthesized TTS?
Can I hear some TTS?
Engines, APIs and Markups
What is SAPI?
What is JSAPI?
More recently, JSAPI2 was released through the Java Community Process (JCP) as JSR 113. This new version of the Java Speech API works on Java ME (Micro Edition). You can find a link to JSAPI2 through our related sites page.
What is JSML?
<JSML><PROS Vol="100"><PROS Rate="10"><PROS Pitch="-25">Hello World!</PROS></PROS></PROS></JSML>
The string "Hello World!" would be spoken with a volume of 100 (much louder), a rate of 10 (slightly faster) and a pitch of -25 (lower pitched). Different JSAPI implementations may differ on what those numbers mean.
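If you generate this kind of markup from a program, a small helper keeps the nesting and escaping consistent. The Python sketch below reproduces the example above; the tag and attribute names (`JSML`, `PROS`, `Vol`, `Rate`, `Pitch`) are copied from that example, and whether a particular JSAPI implementation accepts them is not something this sketch can confirm. The function name `jsml_prosody` is hypothetical.

```python
from xml.sax.saxutils import escape


def jsml_prosody(text: str, vol: int, rate: int, pitch: int) -> str:
    """Wrap text in nested PROS elements, mirroring the FAQ's example markup."""
    body = escape(text)  # escape &, <, > so the text cannot break the markup
    # Wrap innermost element first, so the final nesting is Vol > Rate > Pitch.
    for attr, value in (("Pitch", pitch), ("Rate", rate), ("Vol", vol)):
        body = f'<PROS {attr}="{value}">{body}</PROS>'
    return f"<JSML>{body}</JSML>"


if __name__ == "__main__":
    print(jsml_prosody("Hello World!", vol=100, rate=10, pitch=-25))
```

Escaping the text matters as soon as it comes from users or data: an unescaped "&" or "<" would otherwise produce markup that is no longer well-formed.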
© 2001 - 2021, EverSpeech, Inc.
Comments or questions to:
webmaster@EverSpeech.com