European Conference on Speech Technology
Edinburgh, Scotland, UK
This project integrates the components of a speaker-dependent, isolated-word Voice Activated Typewriter on a personal computer (IBM PC-AT). To build the language model (for French), several routines were written: automatic grapheme-to-phoneme conversion; semi-automatic processing of training texts (20 pages), which builds the graphemic (2,500 words) and phonemic (2,000 words) lexicons and performs syntactic labelling through inductive inference; computation of the probabilistic language model (bigrams and trigrams); and definition of the phonological rules. The speech signal is analysed by 20 digital band-pass filters. Several speech compression techniques were compared on vocabularies of medium and high difficulty; Vector Quantization and Non-Linear Time Compression were chosen. Recognition proceeds in three steps: i) a Fast Match based on word length and gross comparison; ii) a Detailed Match based on conventional DTW algorithms; iii) use of the language model to take the linguistic constraints into account and to achieve the grapheme-to-phoneme conversion. Overall recognition rates of 95% were obtained with a mean recognition time of 2 s, with the 2,000 templates stored in 60 KBytes of RAM.
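The first two recognition steps can be illustrated with a minimal Python sketch. It is not the Hamlet implementation: the function names, the Euclidean frame distance, the unconstrained DTW step pattern, and the `max_length_ratio` threshold are all assumptions made for illustration, and the fast match shown here prunes on word length only (the "gross comparison" and the language-model step are omitted).

```python
import math


def dtw_distance(a, b):
    """Cumulative DTW distance between two sequences of feature vectors.

    Assumption: plain symmetric step pattern (insert, delete, match) with
    Euclidean distance between frames; the abstract only says "conventional DTW".
    """
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative distance aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch the template
                                 cost[i][j - 1],      # stretch the utterance
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


def recognize(utterance, templates, max_length_ratio=1.5):
    """Return (best_word, best_score) for an utterance.

    templates: dict mapping a word to its list of feature vectors.
    max_length_ratio is a hypothetical pruning threshold, not a value
    taken from the paper.
    """
    best_word, best_score = None, math.inf
    for word, template in templates.items():
        # Fast match (length only): discard templates whose duration
        # differs too much from the utterance.
        longer = max(len(utterance), len(template))
        shorter = min(len(utterance), len(template))
        if shorter == 0 or longer / shorter > max_length_ratio:
            continue
        # Detailed match: DTW on the surviving candidates.
        score = dtw_distance(utterance, template)
        if score < best_score:
            best_word, best_score = word, score
    return best_word, best_score
```

In this sketch the length check is what keeps the expensive DTW pass from running against every stored template; a real system would pass the surviving candidates and their scores on to the language model rather than returning the single best word.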
Bibliographic reference. Mariani, Joseph J. (1987): "Hamlet: a prototype of a voice activated typewriter", In ECST-1987, 2222-2225.