Robust speaker DOA estimation based on the inter-sensor data ratio model and binary mask estimation in the bispectrum domain
RIS ID
115517
Abstract
When noise is directional rather than diffuse, most conventional direction of arrival (DOA) estimation techniques suffer performance degradation because their noise models are mismatched. In this paper, a novel robust DOA estimation algorithm is developed as an initial investigation into DOA estimation of speech under directional non-speech interference (DNSI) and non-directional background noise (NDBN) using an acoustic vector sensor (AVS), a compact co-incident microphone array. Specifically, by defining an inter-sensor data ratio model in the bispectrum domain (BISDR), the relationship between the BISDR and the speech DOA cues is derived. By recursively estimating the a priori local signal-to-interference ratio of the bispectrum (B-PriLSIR), a robust speech-dominated binary mask (SDBM) is estimated, and thus the speech DOA cue is faithfully extracted. Experimental results with simulated and recorded data demonstrate that the proposed algorithm offers high DOA estimation accuracy for all angles and is robust against DNSI and NDBN.
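To illustrate the general idea of an inter-sensor data ratio combined with a binary mask, the following is a minimal sketch only. It is not the paper's bispectrum-domain method: it works on ordinary FFT bins, assumes an idealised 2-D AVS whose two velocity channels respond as cos(theta) and sin(theta) relative to the omnidirectional pressure channel, and uses a simple energy threshold as a stand-in for the speech-dominated mask. The function name `avs_isdr_doa`, the threshold value, and the synthetic test signal are all hypothetical.

```python
import numpy as np

def avs_isdr_doa(p, u, v, rel_threshold=0.1):
    """Estimate one azimuth (radians) from idealised 2-D AVS channels.

    p: omnidirectional pressure channel
    u, v: velocity channels along x and y
    Assumes u = cos(theta) * s and v = sin(theta) * s when p = s.
    """
    P, U, V = (np.fft.rfft(x) for x in (p, u, v))
    power = np.abs(P) ** 2
    eps = 1e-12  # avoids division by zero in empty bins

    # Inter-sensor data ratios per frequency bin (velocity over pressure).
    ratio_u = np.real(U * np.conj(P)) / (power + eps)  # ~ cos(theta)
    ratio_v = np.real(V * np.conj(P)) / (power + eps)  # ~ sin(theta)

    # Assumed binary mask: keep only bins with strong pressure energy.
    mask = power > rel_threshold * power.max()

    return np.arctan2(ratio_v[mask].mean(), ratio_u[mask].mean())

# Synthetic check: one broadband source at 40 degrees plus weak sensor noise.
rng = np.random.default_rng(0)
n = 4096
s = rng.standard_normal(n)
theta_true = np.deg2rad(40.0)
p = s + 0.01 * rng.standard_normal(n)
u = np.cos(theta_true) * s + 0.01 * rng.standard_normal(n)
v = np.sin(theta_true) * s + 0.01 * rng.standard_normal(n)

theta_est = avs_isdr_doa(p, u, v)
print(f"estimated azimuth: {np.rad2deg(theta_est):.2f} degrees")
```

In this sketch the mask merely discards low-energy bins; the abstract's SDBM instead selects speech-dominated points using the recursively estimated B-PriLSIR, which is what gives the proposed method its robustness to directional interference.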
Publication Details
Jin, Y., Zou, Y. & Ritz, C. (2017). Robust speaker DOA estimation based on the inter-sensor data ratio model and binary mask estimation in the bispectrum domain. 2017 IEEE International Conference on Acoustics, Speech, and Signal Processing (pp. 3266-3270). United States: IEEE.