This paper describes and compares two time-domain algorithms for segmenting voiced speech into quasi-periodic units that correspond to pitch periods. The first algorithm uses the similarity of adjacent segments to build a graph that represents the distances between them, and then finds a minimal path through the graph with a greedy algorithm. The second algorithm implements a set of heuristics that imitate the actions of a human solving the problem manually. It starts from the middle of the voiced segment, finds the highest waveform peak, and then, using an estimate of the fundamental period, searches for peaks on both sides. Rules are applied to decide on the direction of processing, to prune errors at the beginning and end of the signal, and to cope with jitter. Experimental results on the performance of the algorithms are presented, and their application to precise pitch estimation is discussed.
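The core of the second, heuristic algorithm can be illustrated with a minimal Python sketch. This is not the author's implementation: the function name `segment_pitch_marks`, the `tolerance` parameter, and the choice to look for the starting peak within one period of the midpoint are all assumptions made for illustration. The rules for choosing the processing direction and for pruning errors at the signal boundaries are omitted, and jitter is handled only by allowing each peak to deviate from its expected position by a fixed fraction of the period.

```python
def segment_pitch_marks(signal, period, tolerance=0.2):
    """Return sorted sample indices of pitch marks (waveform peaks).

    signal    -- sequence of samples from one voiced segment
    period    -- estimated pitch period in samples (integer)
    tolerance -- assumed jitter allowance, as a fraction of the period
    """
    n = len(signal)
    if n == 0 or period <= 0:
        return []
    # Half-width of the search window around each expected peak position.
    slack = max(1, int(round(period * tolerance)))

    def best_peak(lo, hi):
        # Index of the largest sample in signal[lo:hi], clipped to the signal.
        lo, hi = max(lo, 0), min(hi, n)
        if lo >= hi:
            return None
        return max(range(lo, hi), key=lambda i: signal[i])

    # Assumption: the starting peak is the highest sample within
    # one period centred on the middle of the segment.
    mid = n // 2
    start = best_peak(mid - period // 2, mid + period // 2 + 1)
    marks = [start]

    # Walk to the right, one estimated period at a time.
    pos = start
    while pos + period < n:
        centre = pos + period
        nxt = best_peak(centre - slack, centre + slack + 1)
        if nxt is None or nxt <= pos:
            break
        marks.append(nxt)
        pos = nxt

    # Walk to the left in the same way.
    pos = start
    while pos - period >= 0:
        centre = pos - period
        prv = best_peak(centre - slack, centre + slack + 1)
        if prv is None or prv >= pos:
            break
        marks.append(prv)
        pos = prv

    return sorted(marks)
```

Each search is centred one period away from the previously accepted peak rather than from the starting peak, so small deviations in period length do not accumulate. Consecutive marks returned by the function delimit the candidate pitch periods.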
Cite as: Petrushin, V.A. (2004) Adaptive algorithms for pitch-synchronous speech signal segmentation. Proc. 9th Conference on Speech and Computer (SPECOM 2004), 146-153
@inproceedings{petrushin04_specom,
  author={Valery A. Petrushin},
  title={{Adaptive algorithms for pitch-synchronous speech signal segmentation}},
  year=2004,
  booktitle={Proc. 9th Conference on Speech and Computer (SPECOM 2004)},
  pages={146--153}
}