ISCA Archive ICSLP 2000

Extracting phonological chunks based on piecewise linear segment lattices

Hiroaki Kojima, Kazuyo Tanaka

The task of our research is to form phone-like models and a phoneme-like set from spoken word samples without using any transcriptions except for the lexical identification of each word in a vocabulary. This framework has two motivations: 1) automatic design of optimal speech recognition units and structures of phone models, and 2) multi-lingual speech recognition based on language-independent intermediate phonetic codes. The procedure consists of two steps: 1) constructing a VQ codebook of sub-phonetic segments from speech samples, and 2) extracting phonological chunks from sequences of the codes. The segment model is represented by a "piecewise linear segment lattice" model, which is a lattice structure of segments, each of which is represented by the regression coefficients of the feature vectors within the segment. Phonological chunks are extracted with a criterion based on the Kullback-Leibler divergence between the distributions of individual VQ codes. The recognition rate reaches approximately 90% on a 1542-word task with 128 VQ codes.
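The two building blocks named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption: ordinary least squares over the frame index stands in for the per-segment regression, a smoothed discrete Kullback-Leibler divergence between two probability vectors stands in for the chunking criterion, and the function names `fit_linear_segment` and `kl_divergence` are hypothetical.

```python
import numpy as np


def fit_linear_segment(frames):
    """Fit one straight line per feature dimension over a segment.

    frames: array of shape (T, D), T feature vectors of dimension D.
    Returns (slope, intercept), each of shape (D,).
    Assumption: plain least squares against the frame index 0..T-1.
    """
    frames = np.asarray(frames, dtype=float)
    t = np.arange(frames.shape[0], dtype=float)
    design = np.stack([t, np.ones_like(t)], axis=1)  # shape (T, 2)
    coeffs, *_ = np.linalg.lstsq(design, frames, rcond=None)  # shape (2, D)
    return coeffs[0], coeffs[1]


def kl_divergence(p, q, eps=1e-12):
    """Discrete KL divergence D(p || q) between two distributions.

    Assumption: eps smoothing avoids log(0); inputs are renormalised.
    """
    p = np.asarray(p, dtype=float) + eps
    q = np.asarray(q, dtype=float) + eps
    p /= p.sum()
    q /= q.sum()
    return float(np.sum(p * np.log(p / q)))


# Hypothetical segment: 5 frames, 2 feature dimensions, exactly linear.
t = np.arange(5, dtype=float)
segment = np.stack([2.0 * t + 1.0, -1.0 * t + 3.0], axis=1)
slope, intercept = fit_linear_segment(segment)

# Hypothetical distributions associated with two VQ codes.
d_same = kl_divergence([0.5, 0.3, 0.2], [0.5, 0.3, 0.2])
d_diff = kl_divergence([0.5, 0.3, 0.2], [0.1, 0.1, 0.8])
```

In this sketch a segment is summarised by its slope and intercept vectors, which is the kind of representation a VQ codebook could then quantise; a small divergence between two codes suggests they behave alike, while a large one suggests a boundary between chunks.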


Cite as: Kojima, H., Tanaka, K. (2000) Extracting phonological chunks based on piecewise linear segment lattices. Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000), vol. 2, 959-962

@inproceedings{kojima00_icslp,
  author={Hiroaki Kojima and Kazuyo Tanaka},
  title={{Extracting phonological chunks based on piecewise linear segment lattices}},
  year=2000,
  booktitle={Proc. 6th International Conference on Spoken Language Processing (ICSLP 2000)},
  pages={vol. 2, 959-962}
}