ISCA Archive Interspeech 2013

Technique for automatic sentence level alignment of long speech and transcripts

Imran Ahmed, Sunil Kumar Kopparapu

A frugal approach to constructing speech corpora, especially for resource-deficient languages, is to exploit collections of speech and corresponding text data available in audio books, news, and lectures. However, using these resources to build speech corpora requires aligning the long-duration speech data with the accompanying text data. Existing techniques for automatic speech-text alignment of long audio files assume the availability of a basic speech recognition engine and hence cannot be used directly for resource-deficient languages. In this paper, we propose a novel technique for sentence-level alignment of long speech-text data that exploits the syllable information in the speech and text data. The proposed technique does not depend on the availability of any speech recognition models and hence can be used for resource-deficient languages.
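
The abstract does not spell out the algorithm, so the following is only an illustrative sketch of how syllable information alone might yield sentence boundaries without a recognizer; it is not the authors' method. It assumes that syllable nuclei have already been located in the audio by some separate detector (the hypothetical `syllable_times` list of timestamps in seconds), and it estimates text syllables with a crude vowel-run heuristic that only suits Latin-script text. The function names `count_syllables` and `align_sentences` are made up for this example.

```python
import re

def count_syllables(sentence: str) -> int:
    """Crude estimate: one syllable per run of vowel letters in each word.

    Assumption: Latin-script text. A real system would use
    language-specific syllabification rules.
    """
    words = re.findall(r"[A-Za-z]+", sentence)
    return sum(max(1, len(re.findall(r"[aeiouy]+", w.lower()))) for w in words)

def align_sentences(sentences, syllable_times):
    """Assign each sentence a (start, end) time in seconds.

    The detected syllable timestamps are split among the sentences in
    proportion to each sentence's estimated syllable count.
    """
    counts = [count_syllables(s) for s in sentences]
    total, n = sum(counts), len(syllable_times)
    if total == 0 or n == 0:
        return []
    spans, cum, start = [], 0, 0
    for c in counts:
        cum += c
        # Index just past the last detected syllable given to this sentence.
        end = min(n, max(start + 1, round(cum * n / total)))
        start = min(start, n - 1)  # guard when detections run out early
        spans.append((syllable_times[start], syllable_times[end - 1]))
        start = end
    return spans

if __name__ == "__main__":
    text = ["The cat sat.", "A dog ran away quickly."]
    times = [i * 0.5 for i in range(10)]  # hypothetical detector output
    for sentence, span in zip(text, align_sentences(text, times)):
        print(span, sentence)
```

Proportional allocation is the simplest possible choice here. It tolerates a mismatch between the number of detected and estimated syllables, but errors accumulate over long recordings, which is why a practical system would need anchor points or a more robust matching step.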


doi: 10.21437/Interspeech.2013-306

Cite as: Ahmed, I., Kopparapu, S.K. (2013) Technique for automatic sentence level alignment of long speech and transcripts. Proc. Interspeech 2013, 1516-1519, doi: 10.21437/Interspeech.2013-306

@inproceedings{ahmed13_interspeech,
  author={Imran Ahmed and Sunil Kumar Kopparapu},
  title={{Technique for automatic sentence level alignment of long speech and transcripts}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={1516--1519},
  doi={10.21437/Interspeech.2013-306}
}