Visit to the Sagayama/Ono Lab
Katy Noland and I had the chance to visit the Sagayama & Ono Lab at the University of Tokyo today. We were very impressed by the research that was presented to us.
Sagayama-sensei showed us the new interface of the Orpheus online composition system, which is fun to play with, as I discovered just now. In fact, I composed my very own Let’s Go to Tokyo song. Well, not really a guaranteed hit yet, but nice indeed.
Ono-sensei gave us a brief introduction to his Harmonic-Percussive Sound Separation (HPSS) method, which we already knew, of course, but then he let his students take over to tell us about some nice applications of it.

Yushi Ueda showed us his latest advances in chord recognition. I was particularly interested to hear that he'd used delta-chroma (i.e. the frame-to-frame change of chroma) as an additional feature and managed to improve his system with it (a little sketch of the idea follows below).

A creative use of HPSS is pursued by Hideyuki Tachibana. He observed that, depending on the window length used in the Fourier analysis, the human voice ends up mainly in either the percussive or the harmonic component. This is because the human voice (and other frequency-fluctuating instruments) sits somewhere in between the “vertical” percussive parts and the “horizontal” harmonic parts. Exploiting this, Tachibana achieved better source separation results for voice (see the second sketch below).

Miquel Espi also works on voice, but in speech signals. He’s a new PhD student at the lab and focuses on finding a rich arsenal of audio features for successful voice activity detection (VAD). VAD can be used for noise reduction in phone calls and other applications.

Stanislaw Raczynski presented the state of his thesis, now just before completion. His quest is quite remarkable: having started with nearly complete but intractable models of music, he gradually simplified them until he obtained a workable model for multi-pitch estimation from audio.
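Since chroma features come up a lot in my own work, here is how I picture the delta-chroma idea, with no claim that this matches Ueda’s actual implementation: take a 12-bin chromagram and stack its first-order temporal difference on top as extra feature dimensions. A minimal numpy sketch (the function name and the zero-padding of the first frame are my own choices):

```python
import numpy as np

def add_delta_chroma(chroma):
    """Append delta-chroma (frame-to-frame change) to a chromagram.

    chroma: array of shape (12, n_frames), one column per analysis frame.
    Returns an array of shape (24, n_frames): the original chroma stacked
    on its first-order temporal difference.
    """
    # Difference between consecutive frames; repeating the first frame as
    # padding makes the first delta zero and keeps the frame count intact.
    delta = np.diff(chroma, axis=1, prepend=chroma[:, :1])
    return np.vstack([chroma, delta])

# Toy usage: a random stand-in "chromagram", 12 pitch classes x 100 frames.
chroma = np.abs(np.random.randn(12, 100))
features = add_delta_chroma(chroma)
print(features.shape)  # (24, 100)
```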
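Tachibana’s window-length observation can be illustrated with the simpler, better-known median-filtering flavour of harmonic/percussive separation; Ono-sensei’s own HPSS uses a different, optimization-based formulation, so the sketch below is mine, not the lab’s code, and the parameter values are placeholders:

```python
import numpy as np
from scipy.signal import stft, istft
from scipy.ndimage import median_filter

def hpss(x, fs, n_fft=2048, kernel=17):
    """Split a signal into harmonic and percussive parts.

    Median-filtering variant of harmonic/percussive separation: harmonic
    energy forms horizontal ridges in the spectrogram, percussive energy
    vertical ones, and n_fft decides which side a pitch-fluctuating voice
    falls on.
    """
    f, t, X = stft(x, fs, nperseg=n_fft)
    S = np.abs(X)
    # Median-filter along time for the harmonic estimate (horizontal
    # ridges), along frequency for the percussive one (vertical ridges).
    H = median_filter(S, size=(1, kernel))
    P = median_filter(S, size=(kernel, 1))
    # Soft masks, then back to the time domain with the original phase.
    eps = 1e-10
    mask_h = H**2 / (H**2 + P**2 + eps)
    mask_p = P**2 / (H**2 + P**2 + eps)
    _, x_h = istft(X * mask_h, fs, nperseg=n_fft)
    _, x_p = istft(X * mask_p, fs, nperseg=n_fft)
    return x_h, x_p

# With a long window (e.g. n_fft=4096) the voice tends toward the
# percussive output; with a short one (e.g. n_fft=256) toward the
# harmonic output -- the effect Tachibana exploits for voice separation.
```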
At the end I also had the chance to present my own work. I briefly explained my NNLS approach, plugged the release of the NNLS Chroma and Chordino plugins, and finally talked about my AIST work on lyrics-to-audio alignment. The guys also seemed impressed by the Song Prompter demo. All in all, a very nice afternoon. In the photo: front row: Miquel Espi and Katy Noland; back row: myself, Yushi Ueda, Stanislaw Raczynski and Nobutaka Ono.