Third International Conference on Spoken Language Processing (ICSLP 94)
This paper proposes a new double talk detection method for improving the man-machine interface of speech dialog systems. Echo cancelers are well known to be useful in detecting double talk. However, to use an echo canceler effectively, correct learning data are necessary: original speech data and echo speech data without double talk. If other speech data are included in the echo speech data, correct learning is obstructed. To distinguish correct learning data from incorrect data, a quantity Q, the difference between the logarithmic power values of the output speech and the input speech, is introduced. Firstly, the two main parameters are defined: the attenuation factor (α) from the system output to the system input through the hybrid circuit, and the attenuation factor (β) from the telephone line to the system input through the hybrid circuit. Secondly, the formula for Q in each state is described using α and β, and discrimination probability functions are introduced based on the differences between the Q distributions in each state. Thirdly, the experiment and the resulting Q distributions in each state are described. Finally, the evaluation results of the new method are shown. The method can select learning speech data for echo canceler learning with 94% accuracy from a 3-second speech input.
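As a rough illustration of the quantity Q described above, the following minimal sketch computes the difference in logarithmic power between an output frame and an input frame. It is not the authors' implementation. The helper names (`log_power_db`, `q_value`), the synthetic sinusoidal signals, the treatment of α and β as linear amplitude gains, and the values `alpha = 0.1` and `beta = 0.5` are all assumptions made for this example.

```python
import math

def log_power_db(frame, floor=1e-12):
    """Mean power of a frame of samples, in dB (floored to avoid log of zero)."""
    power = sum(s * s for s in frame) / len(frame)
    return 10.0 * math.log10(max(power, floor))

def q_value(output_frame, input_frame):
    """Q: log power of the system output minus log power of the system input."""
    return log_power_db(output_frame) - log_power_db(input_frame)

# Assumed amplitude gains (hypothetical values, not taken from the paper).
alpha = 0.1  # system output -> system input through the hybrid circuit
beta = 0.5   # telephone line -> system input through the hybrid circuit

# Synthetic stand-ins for the system's output speech and the caller's speech.
output = [math.sin(0.1 * n) for n in range(800)]
caller = [math.cos(0.37 * n) for n in range(800)]

# Echo-only state: the input contains only the attenuated system output.
echo_only = [alpha * s for s in output]
# Double-talk state: the input also contains the caller's speech.
double_talk = [alpha * s + beta * c for s, c in zip(output, caller)]

q_echo = q_value(output, echo_only)
q_double = q_value(output, double_talk)
```

Under these assumptions, Q in the echo-only state equals -20 log10(α), and Q drops when the caller's speech adds power to the input. A gap of this kind between states is what allows the Q distributions to separate clean learning data from data containing double talk.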
Bibliographic reference. Nishi, H. / Kitai, M. (1994): "Analysis and detection of double talk in telephone dialogs", In ICSLP-1994, 1623-1626.