
Hannah Bast and Prefix Search: Keynote at the Dagstuhl Multimodal Music Processing Seminar

26 January 2011

Today (Wednesday) we saw a very inspiring keynote by Hannah Bast, Professor at the University of Freiburg. She’s an expert in text search and has an impressive history of employers, including Google and the Max Planck Institute for Informatics in Saarbrücken.

What she showed us was simply impressive engineering craft: prototype prefix search engines (going by the name of CompleteSearch) for the DBLP database and the whole of Wikipedia. I say prototype because, as I understand it, they are research in progress, but the search was fully functional, beautifully presented, blazingly fast, and had semantic capabilities that go far beyond what popular search engines offer. The combination of speed and semantic search was in fact Hannah’s main point. She explained that the first attempts to achieve this speed, which relied on beautiful, complex mathematical models, failed: they just did not perform well enough in practice. In contrast, the speed of the current system comes from “simply” reducing the amount of data that has to be read (clever compression) and the number of random access operations (compact storage) needed to retrieve search results, while otherwise using a very simple indexing model.

All the better that the system can deal with “semantic” search queries such as
play* scientist actor (find actors who played a scientist + evidence for this)
at essentially the same speed as simple search queries. This appeared to work quite nicely on the complete Wikipedia data: it gets you the sort of results you’d expect.
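To make the idea of a prefix query a bit more concrete, here is a minimal sketch in Python. It is only my own toy illustration, not how CompleteSearch is implemented: the documents, the function names (build_index, expand_prefix, search) and the plain in-memory inverted index are all assumptions, and it matches keywords literally rather than doing the semantic matching shown in the talk. What it does show is that, in a sorted vocabulary, all words sharing a prefix sit next to each other, so a term like play* can be expanded by reading one contiguous range.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical toy corpus; the real systems index DBLP and Wikipedia.
DOCS = {
    1: "the actor played a scientist in the film",
    2: "a scientist published a paper",
    3: "the actor plays a detective",
}


def build_index(docs):
    """Map each lowercase word to the set of ids of documents containing it."""
    postings = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            postings[word].add(doc_id)
    # A sorted vocabulary keeps all words with a common prefix adjacent.
    vocab = sorted(postings)
    return vocab, postings


def expand_prefix(vocab, prefix):
    """Return every vocabulary word that starts with the given prefix."""
    start = bisect_left(vocab, prefix)
    end = start
    while end < len(vocab) and vocab[end].startswith(prefix):
        end += 1
    return vocab[start:end]


def search(vocab, postings, query):
    """AND together all query terms; a trailing '*' marks a prefix term."""
    result = None
    for term in query.lower().split():
        if term.endswith("*"):
            words = expand_prefix(vocab, term[:-1])
        else:
            words = [term] if term in postings else []
        matches = set()
        for word in words:
            matches |= postings[word]
        result = matches if result is None else result & matches
    return sorted(result or [])


vocab, postings = build_index(DOCS)
print(expand_prefix(vocab, "play"))
print(search(vocab, postings, "play* scientist actor"))
```

On this toy corpus the query only returns the one document that contains a word starting with “play” together with both “scientist” and “actor”. The real system additionally knows which words denote a scientist or an actor, which this sketch does not attempt.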

Incidentally, her system outperforms Google Scholar on the DBLP database in one respect without even needing deep semantic inference: it finds my chord labelling paper when I enter the word “labeling” in American spelling, with a single “l”.

She also briefly touched upon music search and suggested a text-based encoding of music features, and in discussion we were playing with the idea of hybrid music search queries, for example: “look for the song that goes like this [sings melody], which I heard on the BBC last night”.

And she’s a nice person to talk to and play pool with, as we found out last night.
