September 22-25, 1997
One of the most difficult problems in the first stages of automatic speech recognition (ASR) is the identification of consonantal place of articulation (CPA). It is known that the acoustic correlates for CPA reside largely in the pattern of formant transitions preceding vocal tract closure and following release, but common speech preprocessing techniques make only a limited attempt to capture these spectral dynamics in the representation which they pass on for recognition. In order to test alternative preprocessing strategies, we have prepared a multilingual set of VC and CV vocalic transition segments and then compared the baseline performance of human perception of CPA in this dataset with the performance of two common ASR techniques. Representations initially tested were concatenated mel cepstra and mel cepstra plus cepstral differences.
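As a rough illustration of the two representations named above, the following sketch builds a concatenated cepstral vector for a transition segment, and the same vector extended with cepstral differences. It rests on assumptions not stated in the abstract: the differences are taken as simple first differences between successive frames, the function names are hypothetical, and the coefficient values are made-up toy numbers rather than data from the paper.

```python
from typing import List, Sequence

Frame = Sequence[float]  # one frame of mel cepstral coefficients


def concatenate_frames(frames: Sequence[Frame]) -> List[float]:
    """Flatten a sequence of per-frame vectors into a single feature vector."""
    return [coeff for frame in frames for coeff in frame]


def frame_differences(frames: Sequence[Frame]) -> List[List[float]]:
    """First differences between successive frames.

    Assumption: a plain frame-to-frame difference; the paper may have
    used a different (e.g. regression-based) definition.
    """
    return [
        [cur - prev for prev, cur in zip(prev_frame, cur_frame)]
        for prev_frame, cur_frame in zip(frames, frames[1:])
    ]


def cepstra_plus_differences(frames: Sequence[Frame]) -> List[float]:
    """Concatenated cepstra followed by concatenated cepstral differences."""
    return concatenate_frames(frames) + concatenate_frames(frame_differences(frames))


# Toy transition segment: 3 frames of 2 invented coefficients each.
segment = [[1.0, 0.5], [1.5, 0.25], [2.5, 0.0]]

static = concatenate_frames(segment)         # 6 values
dynamic = cepstra_plus_differences(segment)  # 6 static + 4 difference values
```

The difference terms give the classifier an explicit, if crude, encoding of how the spectrum moves across the transition, which is the kind of spectral dynamics the abstract associates with CPA cues.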
Bibliographic reference. Morris, Andrew C. / Bloothooft, Gerrit / Barry, William J. / Andreeva, Bistra / Koreman, Jacques (1997): "Human and machine identification of consonantal place of articulation from vocalic transition segments", In EUROSPEECH-1997, 2123-2126.