In recent years, acoustic-phonetic features (APFs) have attracted considerable interest as a replacement for phones in automatic speech recognition (ASR) systems. Many studies have focused on improving the feature sets and the acoustic parameters used to describe APFs. Invariably, these are developed and tested on a small number of well-researched databases containing read speech. When applied to conversational speech, however, the improved APFs and acoustic parameter sets do not yield the same gains. In two experiments, we show that this transfer from read to conversational speech fails because some of the basic assumptions (here: segmentation in terms of phones) that hold for read speech do not hold for conversational speech. More generally, our studies suggest that the nature of the application data must be taken into account from the outset, when the concepts and basic assumptions of a method are defined, and not only when the method is applied to that data.
Index Terms: acoustic-phonetic feature classification, conversational speech, support vector machines
Cite as: Schuppler, B., Doremalen, J.v., Scharenborg, O., Cranen, B., Boves, L. (2013) The challenge of manner classification in conversational speech. Proc. Speech Production in Automatic Speech Recognition (SPASR-2013), 11-15
@inproceedings{schuppler13_spasr,
  author={Barbara Schuppler and Joost van Doremalen and Odette Scharenborg and Bert Cranen and Lou Boves},
  title={{The challenge of manner classification in conversational speech}},
  year=2013,
  booktitle={Proc. Speech Production in Automatic Speech Recognition (SPASR-2013)},
  pages={11--15}
}