Adapting GCC-PHAT to Co-Prime Circular Microphone Arrays for Speech Direction of Arrival Estimation Using Neural Networks
Proceedings of 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2022
This paper investigates applying convolutional neural networks (CNNs) to co-prime circular microphone arrays (CPCMAs) for robustly estimating the direction of arrival (DOA) of speech sources in noisy environments. Compared to conventional uniform circular arrays (UCAs), the co-prime circular geometry improves the beampattern, array gain and accuracy of speech DOA estimation in adverse environments. DOA estimation based on the generalised cross-correlation with phase transform (GCC-PHAT) is commonly used to improve performance in noisy environments. However, existing DOA estimation methods still produce large errors on CPCMA recordings under high noise, which indicates that the GCC-based feature has not yet fully exploited the benefits of the CPCMA. The proposed algorithm first enhances the CPCMA feature used for training and then applies a CNN structure designed with fewer parameters than existing work. Experimental results show that the combined CPCMA and CNN approach improves speech DOA estimation accuracy across a range of noisy environments, particularly in highly noisy cases, with the CPCMA having the advantage of requiring significantly fewer microphones than a traditional UCA.
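For readers unfamiliar with the GCC-PHAT feature mentioned in the abstract, the following is a minimal sketch of the standard GCC-PHAT computation for a single microphone pair. It is not the paper's enhanced CPCMA feature or its CNN input pipeline; the function name `gcc_phat`, its parameters, and the small regularisation constant are assumptions made for illustration only.

```python
import numpy as np

def gcc_phat(x, y, fs, max_tau=None):
    """Estimate the delay of y relative to x (in seconds) with GCC-PHAT.

    Hypothetical helper for illustration; not taken from the paper.
    Returns the delay estimate and the cross-correlation around zero lag.
    """
    # Zero-pad so the circular correlation behaves like a linear one.
    n = len(x) + len(y)
    X = np.fft.rfft(x, n=n)
    Y = np.fft.rfft(y, n=n)

    # Cross-power spectrum, normalised to unit magnitude (the PHAT weighting).
    cross = Y * np.conj(X)
    cross /= np.abs(cross) + 1e-12  # small constant avoids division by zero

    cc = np.fft.irfft(cross, n=n)

    # Restrict the search to physically plausible lags if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index 0 corresponds to lag -max_shift.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    lag = int(np.argmax(cc)) - max_shift
    return lag / fs, cc
```

As a usage illustration, calling `gcc_phat(x, y, fs=16000, max_tau=0.001)` on two channels returns a time-delay estimate together with the windowed correlation vector. In GCC-based learning approaches, correlation vectors of this kind, computed for each microphone pair, are typically stacked to form the network input; how the paper constructs and enhances its feature is not specified in the abstract.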
Open Access Status
This publication is not available as open access