In this paper we present FiPPiE, a Filter-Inferred Pitch Posteriorgram Estimator: a method of estimating fundamental frequency from spectrograms, either linear or mel, by applying a special kind of filter in the spectral domain. Unlike other works in this field, we developed a procedure for training an optimized filter (or kernel) for this type of estimation. FiPPiE, based on this optimized filter, proved to be a reliable fundamental frequency estimator that is computationally efficient, differentiable, and easy to implement. We demonstrate the performance of the method both by analyzing its behavior on human recordings and by a stability analysis with the help of an automated system.
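The general idea of filter-based pitch estimation in the spectral domain can be illustrated with a minimal sketch. The abstract does not specify the filter, so the code below is only an assumption-laden stand-in: it uses a hand-built harmonic comb kernel (positive lobes at multiples of a candidate frequency, negative lobes halfway between them) rather than the trained, optimized kernel of the paper. All names and constants (`pitch_posterior`, `comb_kernel`, `SIGMA_HZ`, `BETA`, the candidate grid) are hypothetical. Each candidate is scored by correlating its kernel with one linear-spectrogram frame, and a softmax over candidates gives a pitch posterior for that frame.

```python
import math

# All constants are illustrative assumptions, not values from the paper.
SAMPLE_RATE = 16000
N_BINS = 513                                  # linear-frequency bins, 0..Nyquist
BIN_HZ = (SAMPLE_RATE / 2) / (N_BINS - 1)     # width of one bin in Hz
SIGMA_HZ = 20.0                               # width of each kernel lobe
N_HARMONICS = 5                               # harmonics covered by the kernel
BETA = 10.0                                   # softmax sharpness
CANDIDATES_HZ = [float(f) for f in range(80, 401, 5)]


def bump(freq_hz, center_hz):
    """Gaussian lobe centered at center_hz, evaluated at freq_hz."""
    return math.exp(-((freq_hz - center_hz) ** 2) / (2.0 * SIGMA_HZ ** 2))


def synthetic_spectrum(f0_hz, n_partials=20):
    """Toy magnitude spectrum: partials at k * f0 with 1/k amplitudes."""
    return [
        sum(bump(b * BIN_HZ, k * f0_hz) / k for k in range(1, n_partials + 1))
        for b in range(N_BINS)
    ]


def comb_kernel(candidate_hz):
    """Hand-built comb: +1 lobes at harmonics, -1 lobes between them."""
    kernel = []
    for b in range(N_BINS):
        f = b * BIN_HZ
        value = 0.0
        for h in range(1, N_HARMONICS + 1):
            value += bump(f, h * candidate_hz)
            value -= bump(f, (h + 0.5) * candidate_hz)
        kernel.append(value)
    return kernel


def pitch_posterior(spectrum):
    """Score every candidate against one frame, then softmax the scores."""
    scores = [
        sum(s * k for s, k in zip(spectrum, comb_kernel(c)))
        for c in CANDIDATES_HZ
    ]
    top = max(scores)                         # subtract max for stability
    weights = [math.exp(BETA * (s - top)) for s in scores]
    total = sum(weights)
    return CANDIDATES_HZ, [w / total for w in weights]


if __name__ == "__main__":
    cands, post = pitch_posterior(synthetic_spectrum(200.0))
    hard = cands[post.index(max(post))]             # argmax estimate
    soft = sum(c * p for c, p in zip(cands, post))  # expectation estimate
    print(hard, soft)
```

The expectation over the posterior is a smooth function of the scores, which is one way such an estimator can remain differentiable; the negative lobes in this sketch are there to penalize candidates at twice the true pitch.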
Cite as: Finkelstein, L., Chan, C.-a., Wan, V., Zen, H., Clark, R. (2023) FiPPiE: A Computationally Efficient Differentiable method for Estimating Fundamental Frequency From Spectrograms. Proc. 12th ISCA Speech Synthesis Workshop (SSW2023), 218-224, doi: 10.21437/SSW.2023-34
@inproceedings{finkelstein23_ssw,
  author={Lev Finkelstein and Chun-an Chan and Vincent Wan and Heiga Zen and Rob Clark},
  title={{FiPPiE: A Computationally Efficient Differentiable method for Estimating Fundamental Frequency From Spectrograms}},
  year=2023,
  booktitle={Proc. 12th ISCA Speech Synthesis Workshop (SSW2023)},
  pages={218--224},
  doi={10.21437/SSW.2023-34}
}