5th International Conference on Spoken Language Processing
With the advent of spoken language computer interface systems, the storage and management of speech corpora are becoming a more pressing issue in the development of such systems. Until recently, even large corpora were stored as individual text and speech files, or as a single, monolithic file. The issues involved in managing and retrieving the data have been, to a large extent, overlooked. Relational database management systems (RDBMS) are proposed as an ideal tool for the management of speech corpora. Relationships between words and phonemes, and the realisations of these, can be stored and retrieved efficiently. Such a database may be constructed with various levels, to store speaker, language, label transcription, and phonetic information, plus speech as isolated words and derived segmented units. An implementation of such a system, currently called Management Of Otago Speech Environment (MOOSE), is presented to manage the Otago Speech Corpus. The applicability of MOOSE to other corpora is currently under investigation.
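As a rough illustration of the levels described above, the following is a minimal sketch of how a relational schema might link speakers, words, phonemes, word realisations, and derived segments. It is not the MOOSE schema: every table name, column name, file path, and sample row is a hypothetical assumption made for this example, and SQLite (via Python's standard library) is used only because it needs no server.

```python
import sqlite3

# Hypothetical schema: all table and column names are illustrative
# assumptions, not the actual MOOSE design.
SCHEMA = """
CREATE TABLE speaker (
    speaker_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    language   TEXT NOT NULL
);
CREATE TABLE word (
    word_id INTEGER PRIMARY KEY,
    text    TEXT NOT NULL UNIQUE
);
CREATE TABLE phoneme (
    phoneme_id INTEGER PRIMARY KEY,
    symbol     TEXT NOT NULL UNIQUE
);
-- Ordered phoneme sequence of each word (many-to-many link).
CREATE TABLE word_phoneme (
    word_id    INTEGER NOT NULL REFERENCES word(word_id),
    position   INTEGER NOT NULL,
    phoneme_id INTEGER NOT NULL REFERENCES phoneme(phoneme_id),
    PRIMARY KEY (word_id, position)
);
-- One recorded realisation of an isolated word by one speaker.
CREATE TABLE word_realisation (
    realisation_id INTEGER PRIMARY KEY,
    word_id        INTEGER NOT NULL REFERENCES word(word_id),
    speaker_id     INTEGER NOT NULL REFERENCES speaker(speaker_id),
    audio_path     TEXT NOT NULL
);
-- Segmented unit derived from a word realisation.
CREATE TABLE segment (
    segment_id     INTEGER PRIMARY KEY,
    realisation_id INTEGER NOT NULL
                   REFERENCES word_realisation(realisation_id),
    phoneme_id     INTEGER NOT NULL REFERENCES phoneme(phoneme_id),
    start_ms       INTEGER NOT NULL,
    end_ms         INTEGER NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database holding one made-up word recording."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    # Sample rows are invented for the demonstration only.
    conn.execute("INSERT INTO speaker VALUES (1, 'speaker01', 'en-NZ')")
    conn.execute("INSERT INTO word VALUES (1, 'cat')")
    conn.executemany("INSERT INTO phoneme VALUES (?, ?)",
                     [(1, "k"), (2, "ae"), (3, "t")])
    conn.executemany("INSERT INTO word_phoneme VALUES (1, ?, ?)",
                     [(0, 1), (1, 2), (2, 3)])
    conn.execute("INSERT INTO word_realisation "
                 "VALUES (1, 1, 1, 'audio/speaker01/cat.wav')")
    conn.executemany("INSERT INTO segment VALUES (?, 1, ?, ?, ?)",
                     [(1, 1, 0, 80), (2, 2, 80, 210), (3, 3, 210, 300)])
    conn.commit()
    return conn


def segments_for_phoneme(conn, symbol):
    """Return (word, speaker, start_ms, end_ms) for each segment of a phoneme."""
    rows = conn.execute(
        """
        SELECT w.text, s.name, g.start_ms, g.end_ms
        FROM segment g
        JOIN phoneme p ON p.phoneme_id = g.phoneme_id
        JOIN word_realisation r ON r.realisation_id = g.realisation_id
        JOIN word w ON w.word_id = r.word_id
        JOIN speaker s ON s.speaker_id = r.speaker_id
        WHERE p.symbol = ?
        ORDER BY g.segment_id
        """,
        (symbol,),
    )
    return rows.fetchall()
```

Calling `segments_for_phoneme(build_demo_db(), "ae")` retrieves every stored realisation of that phoneme along with the word and speaker it came from. With corpora kept as individual files, the same lookup would require scanning label files by hand; here it is a single join across the word, speaker, and segment levels.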
Bibliographic reference. Laws, Mark / Kilgour, Richard (1998): "MOOSE: management of Otago speech environment", In ICSLP-1998, paper 1116.