University of Wollongong

Time delay estimation of reverberant meeting speech: on the use of multichannel linear prediction

conference contribution
posted on 2024-11-14, 10:30 authored by Eva Cheng, Ian Burnett, Christian Ritz
Effective and efficient access to multiparty meeting recordings requires techniques for meeting analysis and indexing. Since meeting participants are generally stationary, speaker location information may be used to identify meeting events, for example to detect speaker changes. Time-delay estimation (TDE) utilizing cross-correlation of multichannel speech recordings is a common approach for deriving speech source location information. Previous research improved TDE by computing it from linear prediction (LP) residual signals obtained from LP analysis on each individual speech channel. This paper investigates the use of LP residuals for speech TDE, where the residuals are obtained from jointly modeling the multiple speech channels. Experiments conducted with a simulated reverberant room and real room recordings show that jointly modeled LP better predicts the LP coefficients, compared to LP applied to individual channels. Both the individually and jointly modeled LP exhibit similar TDE performance, and both outperform TDE applied directly to the speech signals, especially with the real recordings.
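
As a rough illustration of the baseline idea in the abstract (cross-correlating LP residuals rather than the raw speech), the following is a minimal sketch only. It assumes single-channel LP by the autocorrelation method applied to each channel separately; it does not implement the jointly modeled multichannel LP that the paper investigates. The function names `lp_residual` and `estimate_delay`, the LP order, and the synthetic reverberation-free test signal are all assumptions made for this example and do not come from the paper.

```python
import numpy as np


def lp_residual(x, order=12):
    """Residual of single-channel LP (autocorrelation method). Illustrative only."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(x[: n - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:], lightly regularized for stability.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)
    a = np.linalg.solve(R, r[1:])
    # Prediction x_hat[n] = sum_k a[k-1] * x[n-k]; the residual is x - x_hat.
    pred = np.convolve(x, np.concatenate(([0.0], a)))[:n]
    return x - pred


def estimate_delay(ref, sig, max_lag):
    """Lag in samples by which `sig` trails `ref`, taken at the cross-correlation peak."""
    n = len(ref)
    lags = np.arange(-max_lag, max_lag + 1)
    cc = []
    for lag in lags:
        if lag >= 0:
            cc.append(np.dot(ref[: n - lag], sig[lag:]))
        else:
            cc.append(np.dot(ref[-lag:], sig[: n + lag]))
    return int(lags[int(np.argmax(cc))])


# Synthetic stand-in for speech (assumption): white noise coloured by a stable AR(2) filter.
rng = np.random.default_rng(0)
excitation = rng.standard_normal(4000)
source = np.zeros_like(excitation)
for i in range(len(source)):
    prev1 = source[i - 1] if i >= 1 else 0.0
    prev2 = source[i - 2] if i >= 2 else 0.0
    source[i] = excitation[i] + 1.3 * prev1 - 0.6 * prev2

# Second "microphone" receives the same signal 5 samples later (no reverberation).
true_delay = 5
mic1 = source
mic2 = np.concatenate((np.zeros(true_delay), source[:-true_delay]))

# TDE on the LP residuals of each channel.
estimated_delay = estimate_delay(lp_residual(mic1), lp_residual(mic2), max_lag=20)
print(estimated_delay)
```

In this clean synthetic case, cross-correlating the raw signals would find the same delay; the benefit reported in the abstract concerns reverberant recordings, which this sketch does not model.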

History

Citation

E. Cheng, I. S. Burnett & C. H. Ritz, "Time delay estimation of reverberant meeting speech: on the use of multichannel linear prediction", in International Conference on Signal Image Technology & Internet Based Systems (SITIS '07), 2007, pp. 494-500.

Parent title

Proceedings - International Conference on Signal Image Technologies and Internet Based Systems, SITIS 2007

Pagination

531-537

Language

English

RIS ID

22872
