Doctor of Philosophy
School of Electrical, Computer and Telecommunications Engineering - Faculty of Engineering
Russell, Iain Trent, Developing a subband model for blind signal separation in an acoustic environment, PhD thesis, School of Electrical, Computer and Telecommunications Engineering, University of Wollongong, 2005. http://ro.uow.edu.au/theses/553
The focus of this thesis is to develop a framework for solving convolutively mixed blind signal separation problems in the subband domain. Current methods generally employ a discrete Fourier transform (DFT) to change the time domain convolutive model into many instantaneous multiplicative models, saving on computation and convergence time. The motivation for approaching the problem from the subband domain is that there is an upper bound on the quality of separation for frequency domain methods when the mixing occurs in a reverberant environment and there is a high number of unknown variables to solve for. This is shown with reference to the works in (S. Araki, S. Makino, T. Nishikawa, and H. Saruwatari, 2001; M. Ikram, and D. Morgan, 2000; R. Mukai, S. Araki, H. Sawada, and S. Makino, 2004). The model is developed throughout the thesis in a series of stages. Firstly, we investigate modelling the convolutive Blind Signal Separation (BSS) problem completely in the time domain. The benefit is that, by not performing any transforms, we eliminate the local frequency permutation problem that is inherent in all convolutive BSS problems. Solving the permutation problem requires additional computational overhead. There is a tradeoff, however, depending on how complex the mixing/demixing system is. The longer the reverberation time of an acoustic environment, the more unknown variables must be solved. The savings of performing multiplication in the frequency domain, as opposed to convolution in the time domain, must be weighed against the cost of performing the transform operation twice and of ensuring that the local permutation problem is solved.
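The equivalence that frequency domain methods rely on can be illustrated with a minimal sketch. It assumes a single source and a single short mixing filter (the values of `h` and `s` are hypothetical, chosen only for illustration) and uses a naive DFT so that it needs nothing beyond the standard library. It shows only that linear convolution in the time domain matches bin-by-bin multiplication of zero-padded DFTs; it is not the separation algorithm developed in the thesis.

```python
import cmath

def convolve(h, s):
    """Direct time-domain (linear) convolution of two sequences."""
    out = [0.0] * (len(h) + len(s) - 1)
    for i, hi in enumerate(h):
        for j, sj in enumerate(s):
            out[i + j] += hi * sj
    return out

def dft(x, inverse=False):
    """Naive O(N^2) DFT; inverse=True computes the inverse transform."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t in range(n))
        out.append(acc / n if inverse else acc)
    return out

def convolve_via_dft(h, s):
    """Same convolution, computed as one multiplication per frequency bin."""
    n = len(h) + len(s) - 1                  # zero-pad to avoid circular wrap-around
    H = dft(list(h) + [0.0] * (n - len(h)))
    S = dft(list(s) + [0.0] * (n - len(s)))
    Y = [a * b for a, b in zip(H, S)]        # instantaneous model in each bin
    return [v.real for v in dft(Y, inverse=True)]

# Hypothetical mixing filter and source signal, for illustration only.
h = [1.0, 0.5, 0.25]
s = [1.0, -2.0, 3.0, 0.5]

direct = convolve(h, s)
via_dft = convolve_via_dft(h, s)
max_error = max(abs(a - b) for a, b in zip(direct, via_dft))
print("max difference between the two routes:", max_error)
```

In a real system a fast Fourier transform would replace the naive `dft`, which is where the computational savings come from. Each frequency bin is then separated independently, which is the source of the permutation ambiguity between bins.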
Two new algorithms that avoid the local permutation problem are proposed and investigated. The first uses an alternating least squares (ALS) approach, while the second uses joint diagonalization of the output correlation matrices of the recovered signals. Where it is plausible to assume that some a priori information provides a good initial starting point for the unknown demixing system, we only need to consider a local optimization procedure to solve for it. The two local optimization procedures investigated are the steepest gradient descent and Newton methods. Both local solvers are compared, and the merits and disadvantages of each are specified with regard to the proposed time domain convolutive BSS algorithm. Where small convolutive mixing systems exist, such as wireless communication mixing systems that assume a two-ray model, the additional computational overhead of performing convolution in the time domain is more than offset by the savings of not having to solve the local permutation problem and execute the transform operation.
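The general contrast between the two local solvers can be sketched on a toy problem. The example below is an illustration under stated assumptions: the cost is a hypothetical two-variable quadratic with matrix `A` and vector `b` chosen arbitrarily, not the BSS cost function of the thesis, and the step size and tolerance are likewise assumed values.

```python
# Hypothetical ill-conditioned quadratic cost f(x) = 0.5 x^T A x - b^T x.
A = [[10.0, 2.0], [2.0, 1.0]]   # symmetric positive definite (assumed)
b = [1.0, 1.0]

def grad(x):
    """Gradient of the quadratic cost: A x - b."""
    return [A[0][0] * x[0] + A[0][1] * x[1] - b[0],
            A[1][0] * x[0] + A[1][1] * x[1] - b[1]]

def norm(v):
    return (v[0] ** 2 + v[1] ** 2) ** 0.5

def solve2(M, r):
    """Solve the 2x2 linear system M d = r by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return [(M[1][1] * r[0] - M[0][1] * r[1]) / det,
            (M[0][0] * r[1] - M[1][0] * r[0]) / det]

def steepest_descent(x, step=0.1, tol=1e-8, max_iter=10000):
    """Fixed-step gradient descent; returns (solution, iterations used)."""
    for it in range(max_iter):
        g = grad(x)
        if norm(g) < tol:
            return x, it
        x = [x[0] - step * g[0], x[1] - step * g[1]]
    return x, max_iter

def newton(x, tol=1e-8, max_iter=50):
    """Newton's method; the Hessian of this cost is the constant matrix A."""
    for it in range(max_iter):
        g = grad(x)
        if norm(g) < tol:
            return x, it
        d = solve2(A, g)                      # Newton direction: Hessian^{-1} g
        x = [x[0] - d[0], x[1] - d[1]]
    return x, max_iter

x_gd, n_gd = steepest_descent([0.0, 0.0])
x_nt, n_nt = newton([0.0, 0.0])
print("steepest descent iterations:", n_gd)
print("Newton iterations:", n_nt)
```

On a quadratic cost, Newton's method reaches the minimum in a single step, while fixed-step descent needs many more iterations when the problem is ill-conditioned. The price is that each Newton step requires forming and solving with the Hessian, which grows costly as the number of unknown filter coefficients increases.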
In some cases, information pertaining to the problem is unavailable. Geometric source separation assumes that there is some additional knowledge about the layout of the sensors with spatial reference to the source positions. This allows the angle of incidence of the sound wave impinging on the sensor array to be either known directly or calculated using various beamforming techniques. If we cannot assume such information, then multivariate complex problems with a high number of parameters become harder to solve without obtaining spurious results from convergence to local minima rather than the preferred global minimum, which corresponds to the desired demixing system that allows signal separation. To avoid this, we integrate one of the proposed time domain convolutive BSS algorithms with a global optimization routine tailored to the convolutive BSS problem model. A branch and bound algorithm that uses division by hyper-rectangles is used to solve the uninitialized BSS optimization problem. With the validity of the proposed time domain convolutive BSS algorithm and the global optimization approach justified, attention is then focused on integrating these contributions into a model which uses subband decomposition before performing signal separation.
Various methods of subband decomposition are considered, including uniform FIR analysis/synthesis filter banks based on DFT modulation as well as cosine modulation. The prototype window used is based on an extended lapped transform and was chosen for the computational benefits of lapped transforms. A framework for developing such a subband model is presented, with the main aspects of the model being the BSS algorithm and optimization approaches used, the way in which the observed signals from a multiple-input-multiple-output (MIMO) mixing system are decomposed via a filter bank, and the way in which the local permutation problem is overcome. In our work we propose a new subband detection, correction, and sorting routine for separated but arbitrarily permuted subbands over the entire spectrum.
Finally, a general and systematic approach is presented for obtaining experimental measurements to generate the impulse response of an acoustic environment such as a typical office room, and for inverting the MIMO system using Wiener-Hopf and optimal filtering theory. This allows full availability of information for the problem modelled in a practical environment, as opposed to synthetic testing methods, which are also examined.