Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

First Approach to the Selection of Lexical Units for Continuous Speech Recognition of Basque

Miren Karmele López de Ipiña (1), In Torres (2), Lourdes Oñederra (3), Amparo Varona (2), N. Ezeiza (2), M. Peñagarikano (2), M. Hernandez (4), Luis Javier Rodriguez (2)

(1) Sistemen Ingeniaritza eta Automatika Saila, Gasteiz; (2) Elektrika eta Elektronika Saila, Bilbo; (3) Euskal Filologi Saila, Gasteiz; (4) Konputazio Zientziak eta Adimen Artifiziala, Donostia; University of the Basque Country, Spain

The selection of appropriate Lexical Units is an important issue in Language Model (LM) generation. The word has classically been used as the unit in most Continuous Speech Recognition systems. In recent years, however, proposals for non-word units have begun to appear. Since Basque is an agglutinative language with a certain structure inside the word, non-word units could be an adequate option. In this work, a statistical analysis of the morphological structure of Basque has been carried out. This analysis shows a slight increase in confusion rates in Continuous Speech Recognition systems, due to the large increase in acoustically similar, short units. Finally, several proposals for Lexical Units are analyzed to deal with this problem.


Bibliographic reference. López de Ipiña, Miren Karmele / Torres, In / Oñederra, Lourdes / Varona, Amparo / Ezeiza, N. / Peñagarikano, M. / Hernandez, M. / Rodriguez, Luis Javier (2000): "First approach to the selection of lexical units for continuous speech recognition of Basque", in ICSLP-2000, vol. 2, 531-534.