Automatic Chord Transcription from Audio Using Computational Models of Musical Context

6 August 2010
Publication authored by Mauch, Matthias.

Abstract. This thesis is concerned with the automatic transcription of chords from audio, with an emphasis on modern popular music. Musical context such as the key and the structural segmentation aid the interpretation of chords in human beings. In this thesis we propose computational models that integrate such musical context into the automatic chord estimation process.

We present a novel dynamic Bayesian network (DBN) which integrates models of metric position, key, chord, bass note and two beat-synchronous audio features (bass and treble chroma) into a single high-level musical context model. We simultaneously infer the most probable sequence of metric positions, keys, chords and bass notes via Viterbi inference. Several experiments with real world data show that adding context parameters results in a significant increase in chord recognition accuracy and faithfulness of chord segmentation. The proposed, most complex method transcribes chords with a state-of-the-art accuracy of 73% on the song collection used for the 2009 MIREX Chord Detection tasks. This method is used as a baseline method for two further enhancements.

Firstly, we aim to improve chord confusion behaviour by modifying the audio front end processing. We compare the effect of learning chord profiles as Gaussian mixtures to the effect of using chromagrams generated from an approximate pitch transcription method. We show that using chromagrams from approximate transcription results in the most substantial increase in accuracy. The best method achieves 79% accuracy and significantly outperforms the state of the art.

Secondly, we propose a method by which chromagram information is shared between repeated structural segments (such as verses) in a song. This can be done fully automatically using a novel structural segmentation algorithm tailored to this task. We show that the technique leads to a significant increase in accuracy and readability. The segmentation algorithm itself also obtains state-of-the-art results.

A method that combines both of the above enhancements reaches an accuracy of 81%, a statistically significant improvement over the best result (74%) in the 2009 MIREX Chord Detection tasks.
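To make the idea of Viterbi inference over chord sequences concrete, here is a minimal sketch. It is not the dynamic Bayesian network from the thesis, which also models metric position, key and bass note. It is a plain hidden Markov model over 24 major and minor triads, assuming one 12-bin chroma vector per beat. The binary chord templates, the `sharpness` emission heuristic, the `self_transition` value and all function names are illustrative assumptions, not taken from the thesis.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chord_templates():
    """Binary 12-bin templates for the 12 major and 12 minor triads (assumed model)."""
    templates = {}
    for root in range(12):
        for suffix, third in (("", 4), ("m", 3)):
            bins = [0.0] * 12
            for interval in (0, third, 7):
                bins[(root + interval) % 12] = 1.0
            templates[NOTE_NAMES[root] + suffix] = bins
    return templates


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def viterbi_chords(chroma_frames, self_transition=0.9, sharpness=10.0):
    """Return the highest-scoring chord label sequence for a list of chroma frames.

    Emission scores are a heuristic, sharpness * (cosine - 1), not calibrated
    probabilities. Transitions favour staying on the same chord.
    """
    if not chroma_frames:
        return []
    templates = chord_templates()
    labels = list(templates)
    n = len(labels)
    log_stay = math.log(self_transition)
    log_move = math.log((1.0 - self_transition) / (n - 1))

    def log_emission(frame):
        return [sharpness * (cosine(frame, templates[label]) - 1.0) for label in labels]

    # Uniform prior over chords, so the first frame is scored by emission alone.
    score = log_emission(chroma_frames[0])
    back_pointers = []
    for frame in chroma_frames[1:]:
        emit = log_emission(frame)
        new_score, pointers = [], []
        for j in range(n):
            best_i, best_val = 0, -math.inf
            for i in range(n):
                val = score[i] + (log_stay if i == j else log_move)
                if val > best_val:
                    best_i, best_val = i, val
            new_score.append(best_val + emit[j])
            pointers.append(best_i)
        score = new_score
        back_pointers.append(pointers)

    # Backtrack from the best final state to recover the full path.
    state = max(range(n), key=lambda i: score[i])
    path = [state]
    for pointers in reversed(back_pointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [labels[s] for s in path]


# Usage: four beats of a C major triad followed by four beats of a G major triad.
c_major = [1, 0, 0, 0, 1, 0, 0, 1, 0, 0, 0, 0]
g_major = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1]
print(viterbi_chords([c_major] * 4 + [g_major] * 4))
# prints ['C', 'C', 'C', 'C', 'G', 'G', 'G', 'G']
```

The high self-transition probability smooths the output: a single ambiguous beat inside a stable chord region is usually absorbed rather than labelled as a chord change. The thesis addresses the same kind of ambiguity with richer context (key, metric position, bass) rather than with a fixed transition penalty.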
Author = {Matthias Mauch},
School = {Queen Mary University of London},
Title = {Automatic Chord Transcription from Audio Using Computational Models of Musical Context},
Year = {2010}}


  • Jerico Dela Cruz said:

    Sir, I just wanna ask: do you think there’s still a way to improve on that 81%? Or is it really at its limit?
    I ask because I would also like to propose a thesis on audio-file-to-guitar-chord transcription. Thanks.

  • Matthias Mauch (author) said:

    Hi Jerico, yes, I think it is possible to improve on the current results. In fact, some people have already done so. My own current project aims at integrating the human being in the process of computerised music analysis, and I hope there’s something in that. If you want to discuss this some more, drop me an email, and maybe we can have a chat.