1st Joint SIG-IL/Microsoft Workshop on Speech and Language Technologies for Iberian Languages
Porto Salvo, Portugal
While accurate speech recognition engines are critical to successful speech applications, other factors can affect the user experience even more than the accuracy of the engine itself. For example, the grammar the ASR engine uses should predict what the user will say, but it's often hard for an application developer to design a grammar that results in high system accuracy. I will show how data-driven techniques can be used to build accurate grammars in a straightforward way. I'll also describe a technique that combines a statistical language model with an inverted index; it can be used for applications such as voice search or SMS dictation, and it results in highly accurate end-to-end systems.
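The abstract does not give details of the inverted-index technique, so the following is only a minimal sketch of the general idea, assuming a voice search task over a small set of business listings. The recognizer (with its statistical language model) is assumed to have already produced a free-form text hypothesis; the inverted index then maps that hypothesis onto the closest listing. The function names, the sample listings, and the IDF-style scoring are all illustrative assumptions, not the method presented in the talk.

```python
from collections import defaultdict
import math


def build_index(listings):
    """Map each word to the set of listing ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(listings):
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, listings, hypothesis):
    """Rank listings by the summed IDF weight of words shared with the hypothesis."""
    scores = defaultdict(float)
    n = len(listings)
    for word in set(hypothesis.lower().split()):
        postings = index.get(word, ())
        if not postings:
            continue  # filler or misrecognized words simply match nothing
        # Rare words are more informative than common ones (assumed weighting).
        idf = math.log(1 + n / len(postings))
        for doc_id in postings:
            scores[doc_id] += idf
    # Highest score first; ties broken by listing order.
    ranked = sorted(scores, key=lambda d: (-scores[d], d))
    return [listings[d] for d in ranked]


# Hypothetical listings and a hypothetical recognizer output with filler words.
listings = ["Pizza Hut", "Pizza Palace Downtown", "Home Depot", "Office Depot"]
index = build_index(listings)
results = search(index, listings, "uh pizza hut please")
print(results)
```

One reason such a design may be attractive is that the user does not need to say a listing exactly as it is stored: extra or missing words lower the score of a match rather than causing the whole utterance to be rejected.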
Even an accurate speech recognition system is not enough for a good user experience, because such systems will always make errors and it's critical to provide a graceful error recovery mechanism. Also, users have a choice between speaking, touching a screen, or typing, and may choose not to speak unless speech is better than the alternatives. I will show designs for several systems that take this into account in voice search, education, and the automobile.
Bibliographic reference. Acero, Alex (2009): "Building accurate and user-friendly speech systems", In SLTECH-2009, 3.