European Conference on Speech Technology

Edinburgh, Scotland, UK
September 1987

A Comparison Between Vocal Tract and Auditory Feature Analysis in ASR Systems

Gu Yong, John S. D. Mason

Dept. of Elec. Eng, University College of Swansea, Swansea, UK

A perceptually based linear predictive (PLP) speech analysis technique has been developed by Hermansky [1]. In this paper we discuss applications of PLP analysis and investigate its performance by comparison with standard linear predictive (LP) analysis in an automatic speech recognition (ASR) system. The ASR system is based on dynamic time warping. A vocabulary consisting of the alphabet and the digits zero through nine is used for the tests. In the first experiment, three distance measures, the log likelihood ratio (LLR), the cepstral (CEP) and the root-power sums (RPS), are compared for both PLP and LP. The results show that the RPS distance measure gives the best performance for PLP. In the second experiment, PLP and LP analysis are compared at various orders, using the RPS distance measure. It is shown that PLP of order 5 gives better performance than conventional LP of order 10 in speaker-dependent recognition, giving at least a 2:1 reduction in data and in processing requirements.
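
As an illustration of the kind of matching described above, the following is a minimal sketch, not the authors' implementation. It assumes each frame is a list of cepstral coefficients c_1..c_p, that CEP is taken as the squared Euclidean cepstral distance, and that RPS is taken as its index-weighted form (each difference multiplied by its coefficient index k). The function names and the simple symmetric DTW step pattern are hypothetical choices; the abstract does not give the exact definitions or path constraints used. LLR is omitted because it requires the LP autocorrelation quantities rather than cepstra.

```python
def cep_distance(c1, c2):
    """Squared Euclidean distance between two cepstral vectors (assumed form)."""
    return float(sum((a - b) ** 2 for a, b in zip(c1, c2)))


def rps_distance(c1, c2):
    """Index-weighted cepstral distance: sum over k of (k * (c1_k - c2_k))^2.

    Assumed form of the root-power-sums measure; k starts at 1.
    """
    return float(sum((k * (a - b)) ** 2
                     for k, (a, b) in enumerate(zip(c1, c2), start=1)))


def dtw_cost(ref, test, frame_distance):
    """Accumulated cost of the best alignment between two frame sequences.

    Uses the basic step pattern (horizontal, vertical, diagonal) with no
    slope constraints or path-length normalisation (an assumption).
    """
    n, m = len(ref), len(test)
    inf = float("inf")
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(ref[i - 1], test[j - 1])
            acc[i][j] = d + min(acc[i - 1][j], acc[i][j - 1], acc[i - 1][j - 1])
    return acc[n][m]


def recognise(templates, test, frame_distance=rps_distance):
    """Return the label of the template with the lowest DTW cost.

    `templates` is a hypothetical dict mapping word labels to frame sequences.
    """
    return min(templates, key=lambda w: dtw_cost(templates[w], test, frame_distance))
```

Under these assumptions, lowering the analysis order from 10 to 5 halves the number of coefficients per frame, so both template storage and the per-frame distance computation inside the DTW loop shrink by about the same factor.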

Bibliographic reference.  Yong, Gu / Mason, John S. D. (1987): "A comparison between vocal tract and auditory feature analysis in ASR systems", In ECST-1987, 1132-1135.