An Industrial Strength Audio Search Algorithm
Date: Mon 28th June 2010 15:00
Location: Electronic Engineering, Room 105
Speaker(s): Avery Wang (Shazam Entertainment)

This is a reprise of the talk I gave at ISMIR 2003 on some of the key insights behind the Shazam service. The audio search algorithm is noise and distortion resistant, computationally efficient, and massively scalable. It can quickly identify a short segment of music out of a database of several million tracks, even when the segment is captured through a mobile phone microphone in the presence of foreground voices and other dominant noise, and passed through voice codec compression. The algorithm uses a combinatorially hashed time-frequency constellation analysis of the audio, yielding unusual properties such as transparency, in which multiple tracks mixed together may each be identified. Furthermore, for applications such as radio monitoring, search times on the order of a few milliseconds per query are attained, even on a massive music database.
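
To make the phrase "combinatorially hashed time-frequency constellation" more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the production Shazam implementation: it assumes the spectrogram peaks (the "constellation") have already been extracted as (time, frequency) integer pairs, and the function names (pair_hashes, build_index, identify) and the parameters fan_out and max_dt are hypothetical choices made for this example. Each anchor peak is paired with a few later peaks to form a hash, and a query is matched by voting for a consistent time offset.

```python
from collections import Counter, defaultdict


def pair_hashes(peaks, fan_out=3, max_dt=50):
    """Yield ((f1, f2, dt), t1): each anchor peak paired with up to
    fan_out later peaks that lie within max_dt time steps.
    fan_out and max_dt are illustrative values, not tuned ones."""
    peaks = sorted(peaks)
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break  # peaks are time-sorted, so later ones are farther away
            if dt == 0:
                continue  # skip peaks in the same time frame as the anchor
            yield (f1, f2, dt), t1
            paired += 1
            if paired == fan_out:
                break


def build_index(tracks):
    """Map each hash to the (track_id, anchor_time) pairs where it occurs."""
    index = defaultdict(list)
    for track_id, peaks in tracks.items():
        for h, t in pair_hashes(peaks):
            index[h].append((track_id, t))
    return index


def identify(index, query_peaks):
    """Return (track_id, offset, score) for the best match, or None.
    A true match produces many hashes that agree on one time offset."""
    votes = Counter()
    for h, t_query in pair_hashes(query_peaks):
        for track_id, t_db in index.get(h, ()):
            votes[(track_id, t_db - t_query)] += 1
    if not votes:
        return None
    (track_id, offset), score = votes.most_common(1)[0]
    return track_id, offset, score
```

In this sketch, extra peaks in the query (noise, voices, or a second track mixed in) mostly contribute hashes that find no match or that vote for scattered offsets, so they add little to any single (track, offset) bin. That is one plausible way to read the robustness and transparency properties described above.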

Avery Wang has degrees in Mathematics and Electrical Engineering from Stanford University, specializing in digital signal processing algorithms. He wrote his dissertation on the auditory source separation problem at CCRMA under Julius Smith. He also spent two years at the Ruhr-Universität Bochum with Christof von der Malsburg at the Institut für Neuroinformatik on a Fulbright scholarship. He co-founded Shazam Entertainment in 2000 and is the principal creator of its audio search technology.



Entered by: Mr Emmanouil Benetos 2010-11-03 13:48:31.190304