Speech Recognition and Intrinsic Variation (SRIV2006), Toulouse, France
The present study investigates whether there are word sequences, so-called multiword expressions (MWEs), whose pronunciation deviates so considerably that they might require special treatment in speech technology. The results show that such sequences exist, that they are frequent, and that they are often extremely reduced. Before MWEs can be studied, they first have to be identified. We therefore investigate how such sequences can be detected automatically in a corpus of spontaneous speech, using measures that are known to be related to predictability and phonetic reduction. Our findings indicate that these measures yield different results and that a combination of criteria would probably be most effective.
Bibliographic reference. Strik, Helmer / Elffers, A. / Bavcar, D. / Cucchiarini, Catia (2006): "Half a word is enough for listeners, but problematic for ASR", In SRIV-2006, 101-106.