Podcastle: automatic speech recognition, and you!
In the photo: Masataka Goto (right) explaining the speech-recognition web service Podcastle to Katy Noland.
I can’t hide how impressed I am by Podcastle (http://podcastle.jp/). It’s a web-crawling speech recogniser that munches whole podcast series and provides you with a transcription. Now, having tried the speech recognition service on YouTube, we all know that we’re still miles away from perfect speech recognition. What sets Podcastle apart is that it addresses this by letting users correct the transcriptions in a very intuitive manner. Before you get too excited, let me tell you that this service is only available for Japanese-language podcasts. However, the Podcastle website is available in English, so I tried it a bit (do try it too!) and was quite surprised by how responsive it was. My very sparse knowledge of Japanese even allowed me to correct a word or two. In any case, this collaborative system seems to work really well for popular podcasts: apparently, the fans of these podcasts love correcting the transcriptions (some episodes have more than 2000 corrections!). For the machine learning geeks among you there’s another really cool thing about Podcastle: it actually uses the corrections to improve the underlying speech recogniser, and then re-transcribes the remaining episodes of a podcast with the new model (except for passages that users have already corrected).
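For the geeks, here is a minimal sketch of that last point: after a re-run of the recogniser, take the new text everywhere except where a user has already corrected it. This is only my illustration, not Podcastle’s actual code. The `Segment` type, the `corrected` flag and the `merge_retranscription` function are all made-up names, and I am assuming that both passes cut the audio into the same segments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One stretch of audio and its transcription (hypothetical structure)."""
    start: float             # start time in seconds
    end: float               # end time in seconds
    text: str                # transcribed text
    corrected: bool = False  # True if a user has fixed this segment by hand


def merge_retranscription(current, rerecognized):
    """Combine the current transcript with a fresh recogniser pass.

    Assumption: both lists cover the same segments in the same order.
    User-corrected segments are kept as they are; all other segments
    take the output of the new pass.
    """
    if len(current) != len(rerecognized):
        raise ValueError("both transcripts must have the same segmentation")
    return [old if old.corrected else new
            for old, new in zip(current, rerecognized)]


if __name__ == "__main__":
    current = [
        Segment(0.0, 1.5, "hello and welcome", corrected=True),
        Segment(1.5, 3.0, "to the pot cast"),
    ]
    rerecognized = [
        Segment(0.0, 1.5, "hollow and welcome"),
        Segment(1.5, 3.0, "to the podcast"),
    ]
    for segment in merge_retranscription(current, rerecognized):
        print(segment.text)
```

In a real system the two passes would probably not line up segment by segment, so you would have to match them by time stamps instead. The sketch ignores that.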
Why am I telling you all this now? Well, of course, I knew about Podcastle, since my colleague Jun is heavily involved in it. I’d even read a paper about the technical bits. But I realised how cool it was only last Friday, when Katy Noland visited our lab and Masataka Goto gave her an introduction to Podcastle. Well, I really hope that Jun and Masataka will eventually publish a version for podcasts in other languages: why not German? Maybe we could call it PodBurg…
You can find the latest paper on Podcastle here.