We describe a perceptual space for timbre, define an objective metric that accounts for perceptual orthogonality, and measure the quality of timbre interpolation. We discuss two timbre representations and measure perceptual judgments. We find that a timbre space based on Mel-frequency cepstral coefficients (MFCCs) is a good model of perceptual timbre space.
Cite as: Terasawa, H., Slaney, M., Berger, J. (2005) A timbre space for speech. Proc. Interspeech 2005, 1729-1732, doi: 10.21437/Interspeech.2005-285
@inproceedings{terasawa05_interspeech,
  author={Hiroko Terasawa and Malcolm Slaney and Jonathan Berger},
  title={{A timbre space for speech}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={1729--1732},
  doi={10.21437/Interspeech.2005-285}
}