EUROSPEECH 2003 - INTERSPEECH 2003
This paper describes a singing transcription system that transcribes the human singing voice into musical notes. Because human singing rarely follows a standard musical scale, implementing such a system is a challenge. The system uses several new methods to deal with the imprecise musical scale of a singer's input voice: spectral standard deviation for note segmentation, Adaptive Round Semitone for melody tracking, and a Tune Map that acts as a musical grammar constraint in melody tracking. In addition, a large-vocabulary speech recognizer is added to perform lyric recognition, which is a new trial in a singing transcription system.
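The abstract does not spell out how Adaptive Round Semitone works, so the following is only a rough illustration of the underlying problem, not the authors' algorithm. It assumes one plausible approach: estimate how far a singer's pitch sits from the standard equal-tempered grid (here with a circular mean of the fractional semitone values), then round relative to that estimated offset instead of to the standard scale. All function names, the 440 Hz reference, and the use of MIDI-style note numbers are assumptions made for the sketch.

```python
import math


def hz_to_semitone(freq_hz, ref_hz=440.0):
    """Convert a frequency to a fractional MIDI-style note number (A4 = 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / ref_hz)


def estimate_tuning_offset(semitones):
    """Estimate the singer's average deviation from the semitone grid.

    The fractional parts are treated as angles on a circle so that values
    just below and just above a grid line average correctly.
    Returns an offset in semitones in the range (-0.5, 0.5].
    """
    angles = [2.0 * math.pi * (s - math.floor(s)) for s in semitones]
    mean_sin = sum(math.sin(a) for a in angles) / len(angles)
    mean_cos = sum(math.cos(a) for a in angles) / len(angles)
    return math.atan2(mean_sin, mean_cos) / (2.0 * math.pi)


def plain_round(freqs_hz):
    """Baseline: round each pitch to the nearest standard semitone."""
    return [int(round(hz_to_semitone(f))) for f in freqs_hz]


def adaptive_round(freqs_hz):
    """Hypothetical adaptive rounding: remove the estimated offset first."""
    semitones = [hz_to_semitone(f) for f in freqs_hz]
    offset = estimate_tuning_offset(semitones)
    return [int(round(s - offset)) for s in semitones]


if __name__ == "__main__":
    # A singer about 0.45 semitones flat, with small jitter, singing three
    # notes that are each two semitones apart.
    sung = [440.0 * 2 ** ((s - 69) / 12) for s in (68.65, 70.45, 72.6)]
    print(plain_round(sung))
    print(adaptive_round(sung))
```

In this example the plain rounding distorts the intervals between the notes because the sung pitches straddle the grid boundaries, while rounding relative to the estimated offset keeps the two-semitone steps. The absolute note labels remain ambiguous by up to a semitone, which is one reason a further constraint such as the Tune Map mentioned in the abstract could be useful.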
Bibliographic reference. Wang, Chong-kai / Lyu, Ren-Yuan / Chiang, Yuang-Chin (2003): "An automatic singing transcription system with multilingual singing lyric recognizer and robust melody tracker", In EUROSPEECH-2003, 1197-1200.