Third International Conference on Spoken Language Processing (ICSLP 94)
This paper proposes a new variable bit-rate PSI-CELP speech coding method that switches among four modes roughly corresponding to silence, unvoiced regions, voiced transient regions, and voiced stationary regions. The coder mode is determined by an open-loop procedure every two subframes (20 ms) using feature parameters extracted from the input speech. The proposed method is based on the algorithms used in the PDC half-rate standard, with two changes: the LSP coder uses inter-frame predictive vector quantization every two subframes instead of matrix quantization every four subframes, and in voiced stationary regions the pitch parameters are coded using finite-state inter-subframe prediction to achieve good quality with fewer bits. We examined the performance using Japanese speech data that was approximately 83% active. A listening test with non-specialist subjects showed that the proposed method achieves much better quality than the fixed bit-rate PDC half-rate standard (3.45 kbit/s), at an average bit-rate of 2.53 kbit/s over all speech data, or 2.88 kbit/s when silence is excluded. Even when the input speech was noisy, the proposed method still achieved better quality with fewer bits than the PDC standard. The method considerably reduces the bit-rate not only in silence and unvoiced regions but also in voiced regions, and thus achieves high-quality variable low-bit-rate speech coding.
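The bit-rate figures quoted above can be put side by side with a little arithmetic. The following is a minimal sketch, not part of the paper: the function names are hypothetical, and the silence-rate estimate rests on the assumption that the "approximately 83% active" figure is a time fraction that coincides with the non-silence portion over which the 2.88 kbit/s average was measured. The abstract does not state this, so that result should be read as a rough illustration only.

```python
# Figures quoted in the abstract (all in kbit/s unless noted).
PDC_HALF_RATE = 3.45      # fixed-rate reference coder
AVG_ALL = 2.53            # proposed coder, averaged over all speech data
AVG_NO_SILENCE = 2.88     # proposed coder, averaged with silence excluded
ACTIVE_FRACTION = 0.83    # "approximately 83% active"


def relative_saving(variable_rate, fixed_rate=PDC_HALF_RATE):
    """Fraction of the fixed bit-rate saved by the variable-rate coder."""
    return 1.0 - variable_rate / fixed_rate


def implied_silence_rate(avg_all=AVG_ALL,
                         avg_active=AVG_NO_SILENCE,
                         active=ACTIVE_FRACTION):
    """Average rate during silence implied by a time-weighted mix.

    ASSUMPTION (not from the paper): avg_all is the time-weighted mean of
    avg_active over the active fraction and an unknown silence rate over
    the remaining fraction.
    """
    return (avg_all - active * avg_active) / (1.0 - active)


if __name__ == "__main__":
    print(f"saving over all data:   {relative_saving(AVG_ALL):.1%}")
    print(f"saving without silence: {relative_saving(AVG_NO_SILENCE):.1%}")
    print(f"implied silence rate:   {implied_silence_rate():.2f} kbit/s")
```

Under these assumptions, the overall average is roughly 27% below the PDC half-rate figure and the silence-excluded average roughly 17% below it, which is consistent with the claim that the savings come from voiced regions as well as from silence.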
Bibliographic reference. Ohmuro, Hitoshi / Mano, Kazunori / Moriya, Takehiro (1994): "Variable bit-rate speech coding based on PSI-CELP", In ICSLP-1994, 2067-2070.