ISCA Archive SAPA 2004

Modelling of note events for singing transcription

Matti P. Ryynänen, Anssi P. Klapuri

This paper concerns the automatic transcription of music and proposes a method for transcribing sung melodies. The method produces symbolic notations (i.e., MIDI files) from acoustic inputs based on two probabilistic models: a note event model and a musicological model. Note events are described with a hidden Markov model (HMM) using four musical features: pitch, voicing, accent, and metrical accent. The model uses these features to calculate the likelihoods of different notes and performs note segmentation. The musicological model applies key estimation and the likelihoods of two-note and three-note sequences to determine transition likelihoods between different note events. These two models form a melody transcription system with a modular architecture that can be extended with desired front-end feature extractors and musicological rules. The system correctly transcribes over 90% of notes, thus halving the number of errors compared to simply rounding pitch estimates to the nearest MIDI note.
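The baseline mentioned in the abstract, rounding each pitch estimate to the nearest MIDI note, can be illustrated with a minimal sketch. It assumes the standard equal-tempered mapping with A4 = 440 Hz = MIDI note 69; the function names and the use of `None` for unvoiced frames are hypothetical choices for illustration and are not taken from the paper.

```python
import math

A4_HZ = 440.0   # assumed reference tuning (standard concert pitch)
A4_MIDI = 69    # MIDI note number of A4


def hz_to_midi(f0_hz):
    """Map a fundamental frequency in Hz to a fractional MIDI note number."""
    if f0_hz <= 0:
        raise ValueError("frequency must be positive")
    return A4_MIDI + 12.0 * math.log2(f0_hz / A4_HZ)


def round_to_nearest_note(f0_track):
    """Baseline: round each voiced frame's pitch to the nearest MIDI note.

    Unvoiced frames (None or non-positive values) are passed through as None.
    """
    notes = []
    for f0 in f0_track:
        if f0 is None or f0 <= 0:
            notes.append(None)
        else:
            notes.append(int(round(hz_to_midi(f0))))
    return notes


if __name__ == "__main__":
    # A slightly sharp A4 still rounds to note 69.
    print(round_to_nearest_note([440.0, 261.63, None, 450.0]))
```

Such frame-wise rounding treats every frame independently, so it has no notion of note onsets, durations, or musical context, which is the gap the note event and musicological models are meant to address.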

Cite as: Ryynänen, M.P., Klapuri, A.P. (2004) Modelling of note events for singing transcription. Proc. ITRW on Statistical and Perceptual Audio Processing (SAPA 2004), paper 40

  author={Matti P. Ryynänen and Anssi P. Klapuri},
  title={{Modelling of note events for singing transcription}},
  booktitle={Proc. ITRW on Statistical and Perceptual Audio Processing (SAPA 2004)},
  pages={paper 40}