Sixth European Conference on Speech Communication and Technology
DDBHMM addresses the shortcomings of the traditional HMM. Based on DDBHMM, the problem of how to use duration information effectively is studied in detail. The approach for estimating the duration distribution is introduced first, and the data files are then classified according to speaking rate. The recognition experiments show that duration information helps most on slow speech, helps moderately on medium-rate speech, and has little effect on fast speech. The main value of duration is therefore that it yields more accurate state segmentation points, which in turn improves the recognition rate. At the same time, using duration information improves the robustness of the system to speaking rate. Furthermore, the methods of classified duration and normalized duration are put forward and studied in detail; both are shown to improve performance. To study the dependency between durations, a bigram over durations is proposed and analyzed. Finally, the approach of post-processing with duration is studied; the results show that performance improves greatly only when the system is based on DDBHMM and the duration information is used synchronously during the recognition process.
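As a rough illustration of two ideas named in the abstract, estimating a per-state duration distribution and normalizing durations, the following is a minimal sketch and not the authors' method. It assumes state durations (in frames) are already available from a forced alignment; the function names, the smoothing constant, the duration cap, and the choice to normalize by the mean duration of the utterance are all hypothetical.

```python
from collections import Counter, defaultdict


def estimate_duration_pmf(alignments, max_dur=50, smooth=1e-3):
    """Estimate a smoothed duration histogram for each state.

    alignments: iterable of (state_id, duration_in_frames) pairs.
    Durations are clipped to the range 1..max_dur (an assumption of
    this sketch), and a small constant is added to every bin.
    """
    counts = defaultdict(Counter)
    for state, dur in alignments:
        clipped = max(1, min(dur, max_dur))
        counts[state][clipped] += 1

    pmf = {}
    for state, c in counts.items():
        total = sum(c.values()) + smooth * max_dur
        # pmf[state][d - 1] is the estimated probability of duration d.
        pmf[state] = [(c[d] + smooth) / total for d in range(1, max_dur + 1)]
    return pmf


def normalize_durations(durations):
    """Divide each duration by the utterance mean (hypothetical normalization)."""
    mean = sum(durations) / len(durations)
    return [d / mean for d in durations]


# Toy usage with made-up alignments for two states.
toy = [("a", 3), ("a", 3), ("a", 5), ("b", 10)]
pmf = estimate_duration_pmf(toy, max_dur=20)
normalized = normalize_durations([2, 4, 6])
```

Dividing by the utterance mean is only one simple way to reduce the influence of speaking rate on the duration statistics; the paper itself may define normalized duration differently.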
Bibliographic reference. Zhao, Qingwei / Wang, Zuoying / Lu, Dajin (1999): "A study of duration in continuous speech recognition based on DDBHMM", In EUROSPEECH'99, 1511-1514.