University of Wollongong Thesis Collection 2017+

Single and Multichannel Speech Source Separation using Non-Negative Matrix Factorisation Incorporating Spectral Masks

Yuxiao Feng, University of Wollongong

Year

2017

Degree Name

Master of Philosophy

Department

School of Electrical, Computer and Telecommunications Engineering

Abstract

The separation of mixtures of speech signals has long been an active topic in speech processing. Many speech separation approaches have been proposed, and a successful separation system benefits numerous applications, such as hands-free communication systems. However, the separation performance of existing techniques is still unsatisfactory in terms of both speech quality and speech intelligibility. Recently, data-driven approaches to speech signal processing problems, in which information learnt from example databases of speech recordings is used to derive new signal processing algorithms, have shown significant success. Consequently, this thesis investigates one data-driven model for speech separation, namely non-negative matrix factorisation (NMF), and related methods, with the expectation of achieving higher speech quality and intelligibility of the separated speech sources than existing approaches. Specifically, Chapter 3 proposes an NMF approach modified with spectral magnitude masks typically derived for single-channel speech separation. Chapter 4 then proposes an enhanced NMF approach that utilises estimated direction-of-arrival information to realise multi-channel speech separation. Compared with the corresponding baseline methods, the proposed approaches demonstrate improvements in speech quality and intelligibility metrics, indicating the effectiveness of the approaches proposed in this thesis.

Recommended Citation

Feng, Yuxiao, Single and Multichannel Speech Source Separation using Non-Negative Matrix Factorisation Incorporating Spectral Masks, Master of Philosophy thesis, School of Electrical, Computer and Telecommunications Engineering, University of Wollongong, 2017. https://ro.uow.edu.au/theses1/90

FoR codes (2008)

090609 Signal Processing, 090699 Electrical and Electronic Engineering not elsewhere classified

Unless otherwise indicated, the views expressed in this thesis are those of the author and do not necessarily represent the views of the University of Wollongong.
