14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Technique for Automatic Sentence Level Alignment of Long Speech and Transcripts

Imran Ahmed, Sunil Kumar Kopparapu

TCS Innovation Labs, India

A frugal approach to constructing speech corpora, especially for resource-deficient languages, is to exploit collections of speech and corresponding text data available in audio books, news, and lectures. However, using these resources to build speech corpora requires aligning the long-duration speech data with the accompanying text data. Existing techniques for automatic speech-text alignment of long audio files assume the availability of a basic speech recognition engine and hence cannot be used directly for resource-deficient languages. In this paper, we propose a novel technique for sentence-level alignment of long speech-text data that exploits the syllable information in the speech and text data. The proposed technique does not depend on the availability of any speech recognition models and hence can be used for resource-deficient languages.

Bibliographic reference. Ahmed, Imran / Kopparapu, Sunil Kumar (2013): "Technique for automatic sentence level alignment of long speech and transcripts", In INTERSPEECH-2013, 1516-1519.