Sixth International Conference on Spoken Language Processing
(ICSLP 2000)

Beijing, China
October 16-20, 2000

Extracting Phonological Chunks Based on Piecewise Linear Segment Lattices

Hiroaki Kojima, Kazuyo Tanaka

Electrotechnical Laboratory, AIST, MITI, Tsukuba, Ibaraki, Japan

The aim of our research is to form phone-like models and a phoneme-like set from spoken word samples without using any transcriptions except for the lexical identification of each word in a vocabulary. This framework is derived from two motivations: 1) automatic design of optimal speech recognition units and structures of phone models, and 2) multi-lingual speech recognition based on language-independent intermediate phonetic codes. The procedure consists of two steps: 1) constructing a VQ codebook of sub-phonetic segments from speech samples, and 2) extracting phonological chunks from sequences of the codes. The segment model is represented by a "piecewise linear segment lattice" model, which is a lattice structure of segments, each of which is represented by the regression coefficients of the feature vectors within the segment. Phonological chunks are extracted with a criterion based on the Kullback-Leibler divergence between the distributions of individual VQ codes. The recognition rate reaches approximately 90% on a 1542-word task with 128 VQ codes.
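The two building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that each segment is summarized by a first-order (straight-line) least-squares fit of every feature dimension over the frame index, and that each VQ code has some discrete probability distribution attached to it. The abstract does not say what that distribution ranges over, nor whether the divergence is symmetrized, so the function names `segment_regression`, `kl_divergence`, and `symmetric_kl` and the smoothing constant `eps` are all hypothetical.

```python
import math


def segment_regression(frames):
    """Fit a least-squares line over time to each feature dimension.

    frames: list of equal-length feature vectors, one per frame.
    Returns one (intercept, slope) pair per feature dimension.
    Assumption: a first-order fit; the paper may use another order.
    """
    n = len(frames)
    if n == 0:
        raise ValueError("segment must contain at least one frame")
    t_mean = (n - 1) / 2.0
    t_var = sum((t - t_mean) ** 2 for t in range(n))
    coeffs = []
    for d in range(len(frames[0])):
        y = [frame[d] for frame in frames]
        y_mean = sum(y) / n
        if t_var == 0.0:
            slope = 0.0  # a single frame has no measurable trend
        else:
            slope = sum((t - t_mean) * (y[t] - y_mean) for t in range(n)) / t_var
        coeffs.append((y_mean - slope * t_mean, slope))
    return coeffs


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) of two discrete distributions.

    eps is an illustrative smoothing constant that avoids division by zero.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0.0
    )


def symmetric_kl(p, q):
    """Symmetrized divergence, usable as a dissimilarity between two codes."""
    return kl_divergence(p, q) + kl_divergence(q, p)
```

Under these assumptions, a small value of `symmetric_kl` for the distributions of two VQ codes would mark them as behaving alike, which is the kind of evidence a chunking criterion could use when deciding which codes to group.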


Bibliographic reference.  Kojima, Hiroaki / Tanaka, Kazuyo (2000): "Extracting phonological chunks based on piecewise linear segment lattices", In ICSLP-2000, vol.2, 959-962.