ISCA Archive Interspeech 2005

Analysis of spectral space reduction in spontaneous speech and its effects on speech recognition performances

Masanobu Nakamura, Koji Iwano, Sadaoki Furui

Although read speech and similar types of speech, such as speech from reading newspapers or from news broadcasts, can be recognized with high accuracy, recognition accuracy decreases drastically for spontaneous speech. This is because spontaneous speech and read speech differ significantly both acoustically and linguistically. This paper analyzes differences in acoustic features between spontaneous speech and read speech using a large-scale spontaneous speech database, the "Corpus of Spontaneous Japanese (CSJ)". Experimental results show that spontaneous speech is characterized by a reduced spectral space compared with read speech. A strong correlation was also found between the mean spectral distance between phonemes and phoneme recognition accuracy. This indicates that spectral reduction is one major reason for the lower recognition accuracy of spontaneous speech.
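
As a rough illustration of the quantities named in the abstract, the sketch below computes a mean distance between per-phoneme mean feature vectors and a Pearson correlation coefficient. It is only a sketch under stated assumptions: the abstract does not specify the distance measure or features, so Euclidean distance is assumed here, and the function names and toy 2-D vectors are hypothetical and are not CSJ data.

```python
from itertools import combinations
from math import dist, sqrt


def mean_interphoneme_distance(phoneme_means):
    """Average distance over all pairs of per-phoneme mean vectors.

    Assumption: Euclidean distance; the paper's actual measure may differ.
    """
    pairs = list(combinations(phoneme_means.values(), 2))
    return sum(dist(a, b) for a, b in pairs) / len(pairs)


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Toy, made-up 2-D "spectral" means for three phonemes (illustrative only).
read_like = {"a": (0.0, 0.0), "i": (3.0, 0.0), "u": (0.0, 4.0)}

# The same layout shrunk toward the origin, imitating a reduced spectral space.
reduced = {p: (0.5 * x, 0.5 * y) for p, (x, y) in read_like.items()}

# Ratio below 1 means the second set occupies a smaller spectral space.
ratio = mean_interphoneme_distance(reduced) / mean_interphoneme_distance(read_like)
```

In an analysis of this kind, `pearson` would be applied to one mean distance and one phoneme recognition accuracy per speaker or per speaking style; no such values are given in the abstract, so none are shown here.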


doi: 10.21437/Interspeech.2005-868

Cite as: Nakamura, M., Iwano, K., Furui, S. (2005) Analysis of spectral space reduction in spontaneous speech and its effects on speech recognition performances. Proc. Interspeech 2005, 3381-3384, doi: 10.21437/Interspeech.2005-868

@inproceedings{nakamura05b_interspeech,
  author={Masanobu Nakamura and Koji Iwano and Sadaoki Furui},
  title={{Analysis of spectral space reduction in spontaneous speech and its effects on speech recognition performances}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={3381--3384},
  doi={10.21437/Interspeech.2005-868}
}