We propose a unified framework for segment quantization of speech at ultra low bit-rates of 150 bits/sec, based on the unit-selection principle and using a modified one-pass dynamic programming algorithm. The algorithm handles both fixed- and variable-length units in a unified manner, thereby generalizing two existing unit-selection methods, which deal with single-frame and segmental units differently. The proposed algorithm performs unit-selection based quantization directly on the units of a continuous codebook, and so avoids the sub-optimalities of the existing segmental algorithm. Moreover, the existing single-frame algorithm becomes a special case of the proposed algorithm. Based on the rate-distortion performance on a multi-speaker database, we show that fixed-length units of 6-8 frames perform significantly better than single-frame units and offer spectral distortions similar to those of variable-length phonetic units, thereby circumventing the expensive segmentation and labeling of a continuous database for unit-selection based low bit-rate coding.
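The abstract does not spell out the modified one-pass algorithm, so the following is only a minimal sketch of the general idea for the fixed-length case, under stated assumptions. It treats every window of `unit_len` consecutive codebook frames as a candidate unit, uses squared Euclidean distance as the distortion, and charges a hypothetical `jump_cost` whenever two successive units are not contiguous in the codebook. The function name, the distortion measure, and the jump penalty are illustrative choices, not details taken from the paper.

```python
def quantize_fixed_units(frames, codebook, unit_len, jump_cost):
    """Choose one codebook unit per input segment by dynamic programming.

    frames:   list of feature vectors; its length must be a multiple of unit_len.
    codebook: list of feature vectors kept in their original (continuous) order.
    Returns (start_indices, total_cost), one codebook start index per segment.
    All costs here are illustrative assumptions, not the paper's definitions.
    """
    if unit_len < 1 or len(frames) % unit_len != 0:
        raise ValueError("len(frames) must be a positive multiple of unit_len")
    n_seg = len(frames) // unit_len
    n_units = len(codebook) - unit_len + 1  # number of candidate start positions
    if n_seg < 1 or n_units < 1:
        raise ValueError("need at least one segment and one codebook unit")

    def unit_distortion(k, j):
        # Squared error between input segment k and the unit starting at frame j.
        total = 0.0
        for t in range(unit_len):
            x = frames[k * unit_len + t]
            c = codebook[j + t]
            total += sum((a - b) ** 2 for a, b in zip(x, c))
        return total

    cost = [unit_distortion(0, j) for j in range(n_units)]
    back = []  # back[k-1][j] = best predecessor of unit j at segment k
    for k in range(1, n_seg):
        best_prev = min(range(n_units), key=cost.__getitem__)
        new_cost, pointers = [], []
        for j in range(n_units):
            # Default: jump from the cheapest previous unit and pay the penalty.
            prev, c = best_prev, cost[best_prev] + jump_cost
            # Alternative: continue from the unit that immediately precedes j.
            i = j - unit_len
            if i >= 0 and cost[i] <= c:
                prev, c = i, cost[i]
            new_cost.append(c + unit_distortion(k, j))
            pointers.append(prev)
        cost = new_cost
        back.append(pointers)

    # Backtrack from the cheapest final unit.
    j = min(range(n_units), key=cost.__getitem__)
    total = cost[j]
    path = [j]
    for pointers in reversed(back):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return path, total
```

Setting `unit_len=1` in this sketch gives a single-frame search, which mirrors the abstract's remark that the single-frame algorithm is a special case. Variable-length units would need a search over segment boundaries as well, which this sketch does not attempt.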
Cite as: Ramasubramanian, V., Harish, D. (2006) An unified unit-selection framework for ultra low bit-rate speech coding. Proc. Interspeech 2006, paper 2028-Mon1FoP.4, doi: 10.21437/Interspeech.2006-61
@inproceedings{ramasubramanian06_interspeech,
  author={V. Ramasubramanian and D. Harish},
  title={{An unified unit-selection framework for ultra low bit-rate speech coding}},
  year=2006,
  booktitle={Proc. Interspeech 2006},
  pages={paper 2028-Mon1FoP.4},
  doi={10.21437/Interspeech.2006-61}
}