Mismatched Crowdsourcing from Multiple Annotator Languages for Recognizing Zero-Resourced Languages: A Nullspace Clustering Approach

Wenda Chen, Mark Hasegawa-Johnson, Nancy F. Chen, Boon Pang Lim


It is extremely challenging to create training labels for building acoustic models of zero-resourced languages, for which the conventional resources required for model training (lexicons, transcribed audio, or in extreme cases even an orthographic system or a viable phone set design for the language) are unavailable. Here, language-mismatched transcripts, in which audio is transcribed in the orthographic system of a completely different language, possibly by non-speakers of the target language, may play a vital role. Such mismatched transcripts have recently been obtained successfully through crowdsourcing and shown to benefit ASR performance. This paper further studies the problem of using mismatched crowdsourced transcripts for a tonal language that has no standard orthography and whose phoneme inventory may not even be known. It proposes methods to project multilingual mismatched transcriptions of a tonal language onto the target phone segments. Experiments on Cantonese and Singapore Hokkien show that the accuracy of the reconstructed phone sequences improves by more than 3% absolute over previously proposed monolingual probabilistic transcription methods.


DOI: 10.21437/Interspeech.2017-1567

Cite as: Chen, W., Hasegawa-Johnson, M., Chen, N.F., Lim, B.P. (2017) Mismatched Crowdsourcing from Multiple Annotator Languages for Recognizing Zero-Resourced Languages: A Nullspace Clustering Approach. Proc. Interspeech 2017, 2789-2793, DOI: 10.21437/Interspeech.2017-1567.


@inproceedings{Chen2017,
  author={Wenda Chen and Mark Hasegawa-Johnson and Nancy F. Chen and Boon Pang Lim},
  title={Mismatched Crowdsourcing from Multiple Annotator Languages for Recognizing Zero-Resourced Languages: A Nullspace Clustering Approach},
  year=2017,
  booktitle={Proc. Interspeech 2017},
  pages={2789--2793},
  doi={10.21437/Interspeech.2017-1567},
  url={http://dx.doi.org/10.21437/Interspeech.2017-1567}
}