6th International Conference
on Digital Audio Effects

8-11 September, 2003
Queen Mary, University of London

Tutorials (Mon 08 Sept)

See also: Tutorial Slides

The tutorials cost £30 each, or £50 for both. The price includes light refreshments (not lunch) and printed notes.

Cultural and Acoustic Representations for Music Retrieval and

Brian Whitman, Massachusetts Institute of Technology

Music retrieval systems such as recommenders, genre classifiers and
playlist organizers benefit from a clear view of the acoustic content.
But music carries with it an important cultural and linguistic
accessory: 'Rock' is as much a culturally defined tag as it is an
acoustic one. We will discuss various acoustic feature extraction
techniques for representing musical salience in an audio signal as well
as data mining and natural language processing techniques necessary to
extract the 'cultural aboutness' of musical artists. We will then go over
recent developments in the machine learning techniques needed to learn
relations and so provide the end user with some notion of music
intelligence. Lastly, we will go over mechanisms for grounding knowledge of
one domain in the other: Can we separate audio from audience? Is this
month's top 40 defined culturally or acoustically? Can we learn an
automatic description of audio?

The tutorial will cover the following topics:

+ Music retrieval space: who's doing what and why
+ Musical feature extraction: signal entropy, onset detection,
FFT/STFT, PSD, const-Q FFT, MFCC, autocorrelation
+ Feature post-processing and separation: PCA, NMF, time warping
+ Cultural feature extraction: description, web mining, words as
features, tagging, chunking, TF-IDF, peer-to-peer crawling
+ Machine learning techniques: SVM / RLSC, HMM
+ Combining cultural/acoustic features
+ Grounding words in music
+ Evaluation: surveys, experts, ground truth observation
+ Demo: music similarity using genre anchors vs. grounded anchors
+ Demo: Query by description
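
As a rough illustration of the "words as features" and TF-IDF topics listed above, the following Python sketch weights the terms found in text about artists. The artist names and text are made up, and this TF-IDF variant (relative term frequency times the log of inverse document frequency) is an assumption, not necessarily the formulation used in the tutorial.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    docs maps a document name to a list of tokens.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in docs.values():
        df.update(set(tokens))

    weights = {}
    for name, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        # Relative term frequency scaled by log inverse document frequency.
        weights[name] = {
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        }
    return weights


# Hypothetical web-mined text about two artists (illustrative only).
pages = {
    "artist_a": "loud guitar rock band loud".split(),
    "artist_b": "quiet piano jazz band".split(),
}
weights = tf_idf(pages)
```

Terms that appear in every document, such as "band" here, receive zero weight, while terms that are frequent for one artist and rare elsewhere score highest.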

Sound replacement, beat unmixing and audio mosaics: content-based audio processing with MPEG-7

Michael Casey (City University London)
Adam Lindsay (Lancaster University) [MPEG-7 Co-editors]

Abstract: This tutorial presents new audio processing techniques made
possible using the MPEG-7 international standard. The first part of the
tutorial will give an overview of MPEG-7 descriptors, audio extraction
algorithms and efficient metadata storage and retrieval. The second part
will give detailed examples of applying the standard to music processing
such as beat unmixing, score alignment, sound replacement and audio mosaics.

+ Overview of Low-level audio descriptors: AudioSpectrumEnvelope
(Constant Q Spectrum), AudioSpectrumBasis (SVD/ICA/ISA),
AudioSpectrumProjection, AudioHarmonicity, AudioSpectrumCentroid,
AudioSpectrumVariation
+ Music Descriptors: Melody, Rhythm, Timbre
+ High-level description schemes: SoundModel, SoundClassificationModel
+ Similarity Matrices and Audio structure
+ Pattern Discovery Algorithms
+ XML Parsers and databases
+ Processing with XPath and XSLT
+ MPEG-7 processing in C++/JAVA
+ Multimedia processing: audio, video, images
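
To make the "Similarity Matrices and Audio structure" topic more concrete, here is a minimal Python sketch of a self-similarity matrix computed over per-frame feature vectors. The toy frames and the cosine measure are assumptions for illustration only; they are not MPEG-7 descriptors and not necessarily the presenters' method.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def self_similarity(frames):
    """Compare every frame with every other frame; S[i][j] is their similarity."""
    return [[cosine(u, v) for v in frames] for u in frames]


# Toy feature vectors for four consecutive frames (illustrative values only).
# Frames 0 and 2 match, as do frames 1 and 3, mimicking an A-B-A-B pattern.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
S = self_similarity(frames)
```

Repeated material shows up as high values away from the main diagonal, which is the kind of structure a pattern discovery step could then look for.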

Last updated 10/06/2003 11:23 by Mark Plumbley