ISCA Archive Interspeech 2008

Continuous phone recognition without target language training data

Dau-Cheng Lyu, Sabato Marco Siniscalchi, Tae-Yoon Kim, Chin-Hui Lee

Designing an automatic speech recognition system with little or no language-specific training data is a challenging research topic, because abundant speech training data cannot always be collected for every language of interest. Following our previously studied detection-based paradigm, we used a set of 21 acoustic-phonetic attributes shared by five languages to perform Japanese phone recognition without using any Japanese speech training data. In this paper, we address the key issue of designing attribute-to-phone mapping models with two techniques: (1) a phone-based background model for each speech attribute detector to improve attribute detection; and (2) a data-driven clustering algorithm that groups the attribute-to-phone mapping rules of known languages to predict such rules for target phones in an unseen language. We report experimental results on continuous Japanese phone recognition with the OGI Multilingual Speech Corpus and show that the proposed approach decreases the false rejection rate of attribute detection and improves phone recognition accuracy.

doi: 10.21437/Interspeech.2008-666

Cite as: Lyu, D.-C., Siniscalchi, S.M., Kim, T.-Y., Lee, C.-H. (2008) Continuous phone recognition without target language training data. Proc. Interspeech 2008, 2687-2690, doi: 10.21437/Interspeech.2008-666

  author={Dau-Cheng Lyu and Sabato Marco Siniscalchi and Tae-Yoon Kim and Chin-Hui Lee},
  title={{Continuous phone recognition without target language training data}},
  booktitle={Proc. Interspeech 2008},