Humans have an excellent ability to select a particular sound source in a noisy environment, known as the "cocktail-party effect," and to compensate perceptually for physically missing sound, known as the "illusion of continuity." This paper proposes a spectral peak tracker as a model of the illusion of continuity (or phonemic restoration), together with a spectral sequence prediction method based on that tracker. Although some models have already been proposed, they treat only spectral peak frequencies and often generate incorrect predicted spectra. We introduce a peak representation of the log-spectrum with four parameters (amplitude, frequency, bandwidth, and asymmetry), obtained with a spectral shape analysis method described in terms of the wavelet transform. We also devise a time-varying second-order system to formulate the trajectories of these parameters. We demonstrate that the model can estimate and track the parameters for connected vowels whose transition section has been partially replaced by white noise.
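As a rough illustration of the two ideas in the abstract, the following sketch parameterizes a log-spectral peak by amplitude, frequency, bandwidth, and asymmetry, and carries one parameter trajectory across frames that are missing. It is not the authors' method: the peak shape (`peak_log_spectrum`, a Gaussian with side-dependent width) is a hypothetical choice, and the tracker (`track_through_gap`) is a constant-coefficient alpha-beta filter standing in for the time-varying second-order system mentioned above. All names, gains, and values are assumptions made for illustration.

```python
import math


def peak_log_spectrum(freqs, amplitude, center, bandwidth, asymmetry):
    """Evaluate a hypothetical asymmetric peak on a log-spectrum.

    asymmetry in (-1, 1) widens the upper side and narrows the lower side
    (assumed shape; the abstract does not specify the functional form).
    """
    values = []
    for f in freqs:
        if f >= center:
            width = bandwidth * (1 + asymmetry)
        else:
            width = bandwidth * (1 - asymmetry)
        values.append(amplitude * math.exp(-0.5 * ((f - center) / width) ** 2))
    return values


def track_through_gap(observations, alpha=0.5, beta=0.2):
    """Track one peak parameter with a second-order (position, velocity) state.

    observations: per-frame values; None marks frames masked by noise.
    The first two frames must be observed so the velocity can be initialized.
    Masked frames are filled by pure prediction from the current state.
    """
    position = observations[0]
    velocity = observations[1] - observations[0]
    track = [position]
    for obs in observations[1:]:
        predicted = position + velocity
        if obs is None:
            # No evidence in this frame: continue along the predicted path.
            position = predicted
        else:
            # Correct position and velocity with the prediction error.
            residual = obs - predicted
            position = predicted + alpha * residual
            velocity = velocity + beta * residual
        track.append(position)
    return track


# Hypothetical peak-frequency trajectory (Hz) with two masked frames.
peak_frequency = [500, 520, 540, None, None, 600, 620]
restored = track_through_gap(peak_frequency)
```

With a steadily rising trajectory such as the one above, the masked frames are filled by extrapolating the current slope. A time-varying system, as the abstract describes, would presumably adapt its coefficients from frame to frame instead of using fixed gains.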
Cite as: Akagi, M., Iwaki, M., Sakaguchi, N. (1998) Spectral sequence compensation based on continuity of spectral sequence. Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998), paper 0028, doi: 10.21437/ICSLP.1998-315
@inproceedings{akagi98_icslp,
  author={Masato Akagi and Mamoru Iwaki and Noriyoshi Sakaguchi},
  title={{Spectral sequence compensation based on continuity of spectral sequence}},
  year=1998,
  booktitle={Proc. 5th International Conference on Spoken Language Processing (ICSLP 1998)},
  pages={paper 0028},
  doi={10.21437/ICSLP.1998-315}
}