7th International Conference on Spoken Language Processing

September 16-20, 2002
Denver, Colorado, USA

Refocussing on the Text Normalisation Process in Text-to-Speech Systems

Andrew Breen, Barry Eggleton, Peter Dion, Steve Minnis

Nuance Communications, UK

Many Natural Language Processing applications depend crucially on the front-end processes that handle the input text and transform it into a form usable by the more "sophisticated" linguistic components of those applications. Despite this crucial role, these front-end processes are often considered uninteresting, yet it is not unusual for the perception of the complete application to be affected by this weakest link in the processing chain.

With the recent productisation of many text-to-speech (TTS) systems, the performance of the TTS front-end process, typically called the text normalisation (TN) process, has been highlighted. This component performs sentence recognition, symbol and term expansion, and word tokenisation, but these tasks are not independent. For this reason, enhancing TN coverage often has adverse side-effects, especially when dealing with unrestricted text, so a crucial part of the development of our Nuance Vocalizer 2.0 TTS system is comprehensive regression testing of coverage.
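The side-effect problem can be illustrated with a minimal sketch. It is not taken from Nuance Vocalizer; the rule format, the function names (`normalise`, `run_regression`) and the golden cases are all hypothetical, chosen only to show how a coverage enhancement can break an expansion that previously worked and how a regression suite catches it.

```python
import re

# Hypothetical toy expansion rules: (regex pattern, replacement), applied in order.
BASE_RULES = [
    (r"\bDr\.", "Doctor"),
    (r"\bSt\.", "Street"),
]

def normalise(text, rules):
    """Apply each expansion rule in order and return the expanded text."""
    for pattern, replacement in rules:
        text = re.sub(pattern, replacement, text)
    return text

# Golden cases: inputs whose expected expansions are already known to be correct.
GOLDEN = {
    "Dr. Jones": "Doctor Jones",
    "12 High St.": "12 High Street",
}

def run_regression(rules, golden):
    """Return (input, expected, actual) for every golden case that fails."""
    failures = []
    for source, expected in golden.items():
        actual = normalise(source, rules)
        if actual != expected:
            failures.append((source, expected, actual))
    return failures

# A naive coverage enhancement: read "St." as "Saint" so that "St. Paul" works.
# Because the new rule fires first, it also rewrites street names.
EXTENDED_RULES = [(r"\bSt\.", "Saint")] + BASE_RULES

if __name__ == "__main__":
    print(run_regression(BASE_RULES, GOLDEN))
    print(run_regression(EXTENDED_RULES, GOLDEN))
```

With the base rules the suite reports no failures; with the extended rules "St. Paul" is handled but the street-name case regresses, which is the kind of interaction that rerunning the full golden set after every rule change is meant to expose.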

As TTS systems are increasingly employed as part of general application suites, the TN component becomes the main interface with the controlling applications. A detailed specification of this interface is required, and such a specification lends itself to testing. Preprocessors, such as SSML transducers and email filters, should ensure that no information is lost when they subsume some of the tasks that TN would normally undertake. Refocusing attention on the TN process and its testing is timely and can pay important dividends.


Bibliographic reference. Breen, Andrew / Eggleton, Barry / Dion, Peter / Minnis, Steve (2002): "Refocussing on the text normalisation process in text-to-speech systems", in ICSLP-2002, 153-156.