This paper investigates the use of an Acoustic Vector Sensor (AVS) for tracking a moving speaker in real time through estimation of the Direction of Arrival (DOA). The estimate is obtained by applying the MUltiple SIgnal Classification (MUSIC) algorithm [1] on a time-frame basis. The performance of the AVS is compared with that of a SoundField microphone, which has polar responses similar to those of the AVS, using time frames ranging from 20 ms to 1 s. Results show that with 20 ms frames the AVS can estimate the DOA of both mono-tone and speech signals, for stationary and moving sources alike, accurate to approximately 1.6° in azimuth for stationary speech sources and to less than 5° for moving speech sources. The results also show that DOA estimates from the SoundField microphone are significantly less accurate than those from the AVS. Furthermore, the results suggest that, when estimating the DOA of speech sources, a Voice Activity Detector (VAD) is critical to ensure accurate azimuth estimation.
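The frame-based MUSIC estimate described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the three-channel layout (one omnidirectional channel plus two orthogonal figure-of-eight channels), the steering vector `[1, cos θ, sin θ]`, the 16 kHz sampling rate, the noise level, and the helper names `simulate_frame` and `music_azimuth` are all assumptions made for illustration.

```python
import numpy as np

def simulate_frame(azimuth_deg, fs=16000, frame_ms=20, tone_hz=1000.0,
                   noise_std=0.05, seed=0):
    """Hypothetical 2-D AVS frame: rows are [omni, x, y] channels."""
    rng = np.random.default_rng(seed)
    n = int(fs * frame_ms / 1000)                 # 20 ms at 16 kHz -> 320 samples
    t = np.arange(n) / fs
    s = np.sin(2 * np.pi * tone_hz * t)           # mono-tone source signal
    theta = np.deg2rad(azimuth_deg)
    a = np.array([1.0, np.cos(theta), np.sin(theta)])  # assumed steering vector
    return np.outer(a, s) + noise_std * rng.standard_normal((3, n))

def music_azimuth(frame, n_sources=1, grid_deg=None):
    """Estimate azimuth (degrees) for one frame of shape (3, N) using MUSIC."""
    if grid_deg is None:
        grid_deg = np.arange(0.0, 360.0, 0.5)     # candidate azimuths to scan
    # Sample covariance matrix of the frame.
    R = frame @ frame.T / frame.shape[1]
    # eigh returns eigenvalues in ascending order, so the first columns
    # of eigvecs span the noise subspace.
    _, eigvecs = np.linalg.eigh(R)
    En = eigvecs[:, : frame.shape[0] - n_sources]
    thetas = np.deg2rad(grid_deg)
    A = np.stack([np.ones_like(thetas), np.cos(thetas), np.sin(thetas)])
    # MUSIC pseudo-spectrum: inverse of the steering vector's energy
    # projected onto the noise subspace.
    denom = np.sum((En.T @ A) ** 2, axis=0)
    spectrum = 1.0 / np.maximum(denom, 1e-12)
    return float(grid_deg[np.argmax(spectrum)])

if __name__ == "__main__":
    frame = simulate_frame(azimuth_deg=60.0)
    print("estimated azimuth (deg):", music_azimuth(frame))
```

Tracking a moving source would amount to calling `music_azimuth` on successive frames. For speech, frames flagged as silent by a VAD would be skipped, in line with the abstract's observation that a VAD is critical for accurate azimuth estimates.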
Citation
M. Shujau, C. H. Ritz & I. S. Burnett, "Using in-air acoustic vector sensors for tracking moving speakers," in International Conference on Signal Processing and Communication Systems, 2010, pp. 1-5.
Parent title
4th International Conference on Signal Processing and Communication Systems, ICSPCS'2010 - Proceedings