2 May 2014
Time: 2:00 - 4:00pm
Venue: Eng. 2.09, Engineering Building, Queen Mary University of London, Mile End Road, London, E1 4NS
Johanna Devaney and Michael Mandel, of Ohio State University, will present two back-to-back seminars, entitled "Analyzing recorded vocal performances" and "Strong models for understanding sounds in mixtures", respectively.
First talk: Johanna Devaney, on Analyzing Recorded Vocal Performances
Abstract: A musical performance can both convey the musicians’ interpretation of the written musical score and emphasize, or even manipulate, the emotional content of the music through small variations in timing, dynamics, tuning, and timbre. This talk presents my work on score-guided automatic musical performance analysis, as well as my investigations into vocal intonation practices. The score-audio alignment algorithm I developed to estimate note locations uses a hybrid DTW-HMM multi-pass approach that can capture onset and offset asynchronies between simultaneously notated chords in polyphonic music. My work on vocal intonation practices has examined both solo and ensemble singing, with a particular focus on how musical training, the presence and/or type of accompaniment, and the organization of musical materials affect intonation.
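For readers unfamiliar with score-audio alignment, the following is a minimal sketch of plain dynamic time warping (DTW), the first ingredient named in the abstract. It is not the speaker's hybrid DTW-HMM algorithm: the function name `dtw_path`, the use of one pitch value per frame, and the absolute-difference cost are all assumptions made for illustration.

```python
def dtw_path(score_feats, audio_feats):
    """Align two 1-D feature sequences with plain DTW.

    Returns (total_cost, path), where path is a list of
    (score_index, audio_index) pairs. Illustrative only: real systems
    use richer features (e.g. chroma vectors) than a single number.
    """
    n, m = len(score_feats), len(audio_feats)
    inf = float("inf")
    # acc[i][j] = cheapest cost of aligning the first i score frames
    # with the first j audio frames.
    acc = [[inf] * (m + 1) for _ in range(n + 1)]
    acc[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(score_feats[i - 1] - audio_feats[j - 1])
            acc[i][j] = d + min(acc[i - 1][j - 1], acc[i - 1][j], acc[i][j - 1])

    # Backtrack from the end to recover the alignment path.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        candidates = [
            (acc[i - 1][j - 1], i - 1, j - 1),
            (acc[i - 1][j], i - 1, j),
            (acc[i][j - 1], i, j - 1),
        ]
        _, i, j = min(candidates)
    path.reverse()
    return acc[n][m], path


# Hypothetical example: three notated pitches (MIDI numbers) against
# six audio frames in which each note is held for a different duration.
score = [60, 62, 64]
audio = [60, 60, 62, 62, 62, 64]
cost, path = dtw_path(score, audio)
```

In this toy case each score note maps to the run of audio frames that carry the same pitch, so the first and last audio index assigned to a note give a rough onset and offset estimate.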
Bio: Johanna Devaney is an assistant professor of music theory and cognition at The Ohio State University. Her research applies a range of interdisciplinary approaches to the study of musical performance, motivated by a desire to understand how performers mediate listeners’ experience of music. Her work on extracting and analyzing performance data, with a particular focus on intonation in the singing voice, integrates the fields of music theory, music perception and cognition, signal processing, and machine learning. She has released a number of the tools she has developed in the open-source Automatic Music Performance Analysis and Comparison Toolkit (www.ampact.org). Johanna completed her PhD at the Schulich School of Music of McGill University. She also holds an M.Phil. degree from Columbia University, as well as an MA from York University in Toronto. Before working at Ohio State, she was a postdoctoral scholar at the Center for New Music and Audio Technologies (CNMAT) at the University of California, Berkeley.
Second talk: Michael Mandel, on Strong models for understanding sounds in mixtures
Abstract: Human abilities to understand sounds in mixtures, for example, speech in noise, far outstrip current automatic approaches, despite recent technological breakthroughs. This talk presents two projects that use strong models of speech to begin to close this gap and discusses their implications for musical applications. The first project investigates the human ability to understand speech in noise using a new data-driven paradigm. By formulating intelligibility prediction as a classification problem, the model is able to learn the important spectro-temporal features of speech utterances from the results of listening tests using real speech. It is also able to generalize successfully to new recordings of the same and similar words. The second project aims to reconstruct damaged or obscured speech similarly to the way humans might, by using a strong prior model. In this case, the prior model is a full large-vocabulary continuous speech recognizer. Posed as an optimization problem, this system finds the latent clean speech features that minimize a combination of the distance to the reliable regions of the noisy observation and the negative log likelihood under the recognizer. It reduces both speech recognition errors and the distance between the estimated speech and the original clean speech.
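The shape of the optimization in the second project can be illustrated with a toy sketch. It is not the speaker's system: an independent Gaussian prior stands in for the speech recognizer, and the function name `reconstruct`, the parameters `weight`, `steps`, and `lr`, and the squared-error distance are all assumptions made for illustration.

```python
def reconstruct(noisy, reliable, prior_mean, prior_var,
                weight=1.0, steps=500, lr=0.1):
    """Estimate clean features x by gradient descent on

        sum over reliable k of (x[k] - noisy[k])**2
        + weight * sum over all k of (x[k] - prior_mean[k])**2 / (2 * prior_var[k])

    The second term is the negative log likelihood (up to a constant)
    under a Gaussian prior, used here as a stand-in for a recognizer.
    """
    x = list(noisy)  # start from the noisy observation
    for _ in range(steps):
        for k in range(len(x)):
            # Gradient of the prior (negative log likelihood) term.
            grad = weight * (x[k] - prior_mean[k]) / prior_var[k]
            # Distance term applies only where the observation is trusted.
            if reliable[k]:
                grad += 2.0 * (x[k] - noisy[k])
            x[k] -= lr * grad
    return x


# Hypothetical example: the middle feature is marked unreliable.
noisy = [1.0, 5.0, 2.0]
reliable = [True, False, True]
estimate = reconstruct(noisy, reliable,
                       prior_mean=[0.0, 0.0, 0.0],
                       prior_var=[1.0, 1.0, 1.0])
```

In this sketch, reliable features end up between the observation and the prior mean, while the unreliable feature is filled in entirely from the prior.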
Bio: Michael I. Mandel earned his BSc in Computer Science from the Massachusetts Institute of Technology in 2004 and his MS and PhD with distinction in Electrical Engineering from Columbia University in 2006 and 2010 as a Fu Foundation School of Engineering and Applied Sciences Presidential Scholar. From 2009 to 2010 he was an FQRNT Postdoctoral Research Fellow in the Machine Learning laboratory at the Université de Montréal. From 2010 to 2012 he was an Algorithm Developer at Audience Inc, a company that has shipped over 350 million noise suppression chips for cell phones. He is currently a Research Scientist in Computer Science and Engineering at The Ohio State University, where he recently received an Outstanding Undergraduate Research Mentor award. His research applies signal processing and machine learning to computational audition problems including source separation, robust speech recognition, and music classification and tagging.