We present a method for morphing between smooth spectral magnitude envelopes of speech. Central to the method is the notion of audio flow, inspired by the optical flow computed between images in computer vision. Audio flow defines the correspondence between two smooth spectral magnitude envelopes and encodes the formant shifting that occurs from one sound to another. We present several algorithms for automatically computing audio flow from a small, 20-second corpus of speech. We also present an algorithm for morphing smoothly between any two spectral magnitude envelopes, given the computed audio flow between them.
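To make the idea of morphing along a correspondence concrete, the following is a minimal sketch and not the authors' algorithm, which the abstract does not spell out. It assumes the flow is stored as one displacement per frequency bin, and the function name `morph_envelopes`, its parameters, and the synthetic single-peak envelopes are all hypothetical. Each point is moved part of the way along its flow vector, the magnitudes of the two corresponding points are blended, and the result is resampled onto the regular bin grid.

```python
import numpy as np


def morph_envelopes(env_a, env_b, flow_ab, alpha):
    """Morph between two spectral magnitude envelopes along a given flow.

    Assumption (not from the paper): flow_ab[k] is a displacement in
    frequency bins, so bin k of env_a corresponds to bin k + flow_ab[k]
    of env_b. alpha = 0 returns env_a; alpha = 1 approximates env_b.
    """
    env_a = np.asarray(env_a, dtype=float)
    env_b = np.asarray(env_b, dtype=float)
    flow_ab = np.asarray(flow_ab, dtype=float)
    if not (env_a.shape == env_b.shape == flow_ab.shape):
        raise ValueError("envelopes and flow must have the same shape")
    bins = np.arange(env_a.size, dtype=float)

    # Magnitude of env_b at the point corresponding to each bin of env_a.
    # np.interp clamps to the edge values outside the valid bin range.
    b_matched = np.interp(bins + flow_ab, bins, env_b)

    # Move each point part of the way along its flow vector and blend the
    # magnitudes of the two corresponding points.
    positions = bins + alpha * flow_ab
    values = (1.0 - alpha) * env_a + alpha * b_matched

    # Resampling needs strictly increasing positions (no fold-overs).
    if np.any(np.diff(positions) <= 0):
        raise ValueError("flow folds the frequency axis at this alpha")

    # Resample the warped, blended envelope onto the regular bin grid.
    return np.interp(bins, positions, values)


# Synthetic example: one formant-like peak that shifts up by 10 bins.
k = np.arange(64)
env_a = np.exp(-0.5 * ((k - 20) / 3.0) ** 2)  # peak at bin 20
env_b = np.exp(-0.5 * ((k - 30) / 3.0) ** 2)  # peak at bin 30
flow = np.full(64, 10.0)                      # constant shift (assumed)
halfway = morph_envelopes(env_a, env_b, flow, 0.5)
print(int(np.argmax(halfway)))
```

In this toy case the halfway envelope has a single peak between the two original peak positions, whereas a plain cross-fade of the two envelopes would keep two separate peaks at bins 20 and 30. That difference is the motivation for morphing along a correspondence rather than averaging magnitudes bin by bin.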
Cite as: Ezzat, T., Meyers, E., Glass, J., Poggio, T. (2005) Morphing spectral envelopes using audio flow. Proc. Interspeech 2005, 2545-2548, doi: 10.21437/Interspeech.2005-791
@inproceedings{ezzat05_interspeech,
  author={Tony Ezzat and Ethan Meyers and James Glass and Tomaso Poggio},
  title={{Morphing spectral envelopes using audio flow}},
  year=2005,
  booktitle={Proc. Interspeech 2005},
  pages={2545--2548},
  doi={10.21437/Interspeech.2005-791}
}