In this paper, we compare speech recognition performance using broad phonetically and acoustically motivated units as a preprocessor in the design of a novel noise-robust landmark detection and segmentation algorithm. We introduce a cluster evaluation method to measure the quality of acoustic unit clusters. On the noisy TIMIT task, we find that the acoustic and phonetic segmentation approaches offer significant improvements over two baseline methods used in the SUMMIT segment-based speech recognizer: a sinusoidal model method and a spectral change approach. In addition, we find that the acoustic method is much faster to compute in stationary noise, while the phonetic approach is faster in non-stationary noise conditions.
Bibliographic reference. Sainath, Tara N. / Zue, Victor (2008): "A comparison of broad phonetic and acoustic units for noise robust segment-based phonetic recognition", In INTERSPEECH-2008, 2378-2381.