ISCA Archive Interspeech 2013

Syllable-based pitch encoding for low bit rate speech coding with recognition/synthesis architecture

Milos Cernak, Xingyu Na, Philip N. Garner

Current HMM-based low bit rate speech coding systems work with phonetic vocoders. Pitch contour coding (at the frame or phoneme level) is usually fairly orthogonal to other speech coding parameters. We assume that the speech signal contains supra-segmental cues. Hence, we present pitch encoding at the syllable level, used in the framework of a recognition/synthesis speech coder with a phonetic vocoder. The results imply that high-accuracy pitch contour reconstruction with negligible speech quality degradation is possible. The proposed pitch encoding technique operates at 30.35 bits per second.


doi: 10.21437/Interspeech.2013-755

Cite as: Cernak, M., Na, X., Garner, P.N. (2013) Syllable-based pitch encoding for low bit rate speech coding with recognition/synthesis architecture. Proc. Interspeech 2013, 3449-3452, doi: 10.21437/Interspeech.2013-755

@inproceedings{cernak13_interspeech,
  author={Milos Cernak and Xingyu Na and Philip N. Garner},
  title={{Syllable-based pitch encoding for low bit rate speech coding with recognition/synthesis architecture}},
  year=2013,
  booktitle={Proc. Interspeech 2013},
  pages={3449--3452},
  doi={10.21437/Interspeech.2013-755}
}