14th Annual Conference of the International Speech Communication Association

Lyon, France
August 25-29, 2013

Controlling “Shout” Expression in a Japanese POP Singing Performance: Analysis and Suppression Study

Yuri Nishigaki (1), Ken-Ichi Sakakibara (2), Masanori Morise (3), Ryuichi Nisimura (1), Toshio Irino (1), Hideki Kawahara (1)

(1) Wakayama University, Japan
(2) Health Science University of Hokkaido, Japan
(3) University of Yamanashi, Japan

The degree of "shout" in a singing performance is effectively controlled by combining global spectral shape equalization, peak cancellation in the frequency modulation spectrum of the F0 trajectory, and synchronized shape modulation of the voice spectral envelope. This "shout-reduction" processing is based on a symmetry-based F0 extractor with fine temporal resolution, a temporally stable representation of the instantaneous frequency of periodic signals, and TANDEM-STRAIGHT, a speech analysis, modification, and resynthesis framework. The proposed procedure successfully converted an expressive Japanese POP song performance with "shout" into a plain performance without damaging its original naturalness. The possibility of adding artificial "shout" to a plain performance is also discussed.

Bibliographic reference. Nishigaki, Yuri / Sakakibara, Ken-Ichi / Morise, Masanori / Nisimura, Ryuichi / Irino, Toshio / Kawahara, Hideki (2013): "Controlling “shout” expression in a Japanese POP singing performance: analysis and suppression study", In INTERSPEECH-2013, 2905-2909.