A TFBSS framework for ego-noise reduction

Lin Wang, Andrea Cavallaro

Queen Mary University of London - Centre for Intelligent Sensing



Abstract

Acoustic sensing from a multi-rotor drone is heavily degraded by the strong ego-noise produced by its rotating motors and propellers. To address this problem, we propose a blind source separation (BSS) framework that extracts a target sound from the noisy multi-channel signals captured by a microphone array mounted on the drone. The proposed method addresses the challenging problem of permutation alignment in extremely low signal-to-noise-ratio scenarios (e.g., SNR below -15 dB) by clustering the time activities of the separated signals across frequencies. Since initialization plays an important role in the success of clustering, we propose a pre-processing algorithm that uses time-frequency spatial filtering (TFS) to generate a reference for pre-aligning the permutations. The pre-alignment not only improves the performance of clustering and permutation alignment, but also solves the target-channel selection problem for BSS. The proposed method thus combines the advantages of TFS and BSS. Experimental results on real recorded data show that the proposed method can process the audio stream continuously in a blockwise manner and substantially outperforms the state of the art.
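To make the core idea concrete, the following is a minimal Python sketch (not the paper's implementation) of permutation alignment by clustering per-frequency time-activity envelopes: each frequency bin's source ordering is chosen to best match a set of running centroids, which in the paper would be initialized from the TFS reference. The function name, the simple correlation-based matching, and the exhaustive permutation search are illustrative assumptions.

```python
import numpy as np
from itertools import permutations

def align_permutations(Y, n_iter=10):
    """Align source permutations across frequency bins by clustering
    the time-activity envelopes of the separated signals.

    Y : complex STFT of the separated signals, shape (F, N, T)
        (F frequency bins, N sources, T time frames).
    Returns Y with a consistent source ordering across frequencies.
    """
    F, N, T = Y.shape

    # Per-frequency activity envelopes, zero-mean and unit-norm per source,
    # so that correlation between envelopes reduces to a dot product.
    A = np.abs(Y)
    A -= A.mean(axis=2, keepdims=True)
    A /= np.linalg.norm(A, axis=2, keepdims=True) + 1e-12

    # Initialize centroids from the current ordering; in the paper a TFS
    # reference pre-aligns the permutations before this step.
    C = A.mean(axis=0)                                  # (N, T)

    # Exhaustive search over N! permutations; fine for the 2-4 sources
    # typical of this task, but factorial in N.
    perms = [list(p) for p in permutations(range(N))]
    for _ in range(n_iter):
        for f in range(F):
            # Pick the permutation whose envelopes best match the centroids.
            best = max(perms, key=lambda p: np.sum(C * A[f, p]))
            A[f] = A[f, best]
            Y[f] = Y[f, best]
        # Re-estimate and re-normalize the cluster centroids.
        C = A.mean(axis=0)
        C /= np.linalg.norm(C, axis=1, keepdims=True) + 1e-12
    return Y
```

As a usage sketch, Y would come from a per-frequency BSS stage (e.g., Y = align_permutations(Y) with Y of shape (257, 2, 400) for a 512-point STFT, two sources and 400 frames); the quality of the initial centroids is what makes the TFS-based pre-alignment important at very low SNR.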

Reference

L. Wang and A. Cavallaro, "A blind source separation framework for ego-noise reduction on multi-rotor drones," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 2523-2537, Aug. 2020. [pdf]

Audio and video demos


Scenario S1: continuous processing (audio)

Original
Enhanced


Scenario S2: indoor environment

Original
Enhanced


Scenario S3: outdoor environment


Original
Enhanced


DREGON: moving drone

External video (http://dregon.inria.fr/datasets/dregon/)

Original audio
Enhanced audio