11th Annual Conference of the International Speech Communication Association

Makuhari, Chiba, Japan
September 26-30, 2010

A Statistical Segment-Based Approach for Spoken Language Understanding

Lucía Ortega, Isabel Galiano, Lluís-F. Hurtado, Emilio Sanchis, Encarna Segarra

Universidad Politécnica de Valencia, Spain

In this paper we propose an algorithm to learn statistical language understanding models from a corpus of unaligned pairs of sentences and their corresponding semantic frames. Specifically, it automatically maps variable-length word segments to their corresponding semantic labels, which in turn allows user utterances to be decoded into their corresponding meanings. In this way we avoid the time-consuming work of manually associating semantic tags with words. We use the algorithm to learn the understanding component of a Spoken Dialog System for railway information retrieval in Spanish. Experiments show that the results obtained with the proposed method are very promising, while the effort required to obtain the models is not comparable to that of manually segmenting the training corpus.

Bibliographic reference.  Ortega, Lucía / Galiano, Isabel / Hurtado, Lluís-F. / Sanchis, Emilio / Segarra, Encarna (2010): "A statistical segment-based approach for spoken language understanding", In INTERSPEECH-2010, 1836-1839.