Year

2000

Degree Name

Doctor of Philosophy

Department

School of Electrical, Computer and Telecommunications Engineering

Abstract

Speech coding has experienced rapid growth throughout the past decade as many new desirable commercial applications have emerged. However, there remains a void of solutions for good synthesised speech at bit rates around 4 kbit/s. At this transmission rate, the limitations of both waveform coders, which produce excellent quality at higher rates, and parametric coders, which operate well at lower rates, inhibit their performance quality. In this thesis, several techniques that provide improved signal analysis and bridge the gap between waveform and parametric coders are proposed. Firstly, basic Waveform Interpolation (WI) principles are considered. These take advantage of the pitch periodicity and perceptual redundancies of speech. The decomposition of WI, which separates voiced and unvoiced characteristics, is an advantageous mechanism to exploit perceptual differences. This concept is thus extended to provide a multi-resolution analysis of speech evolution by implementing perfect reconstruction wavelet filter banks. Several causal, stable, finite impulse response (FIR) and infinite impulse response (IIR) filter bank designs are discussed. These are adapted to the signal properties by drawing upon closely related wavelet theory. The proposed wavelet decomposition allows the application of flexible, efficient, perception-based quantisation techniques to code the excitation signal.
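
As a rough illustration of what "perfect reconstruction" means for a multi-resolution filter bank, the sketch below splits a sequence into subbands with the two-tap Haar filter pair and then rebuilds it exactly. The Haar filters are only an assumed stand-in, chosen because they are the simplest perfect reconstruction pair; they are not the FIR or IIR designs discussed in the thesis. The function names (`haar_analysis`, `haar_synthesis`, `multires`, `reconstruct`) are hypothetical, and the input is a generic sequence rather than the evolving pitch-cycle waveforms that a WI coder would decompose.

```python
import math

# Normalisation that makes the Haar pair orthonormal (an assumed stand-in
# filter, not one of the thesis designs).
_S = 1.0 / math.sqrt(2.0)


def haar_analysis(x):
    """Split an even-length sequence into low-pass and high-pass subbands,
    each downsampled by two."""
    if len(x) % 2:
        raise ValueError("input length must be even")
    low = [_S * (x[i] + x[i + 1]) for i in range(0, len(x), 2)]
    high = [_S * (x[i] - x[i + 1]) for i in range(0, len(x), 2)]
    return low, high


def haar_synthesis(low, high):
    """Invert haar_analysis: interleave the two subbands back into one sequence."""
    x = []
    for a, d in zip(low, high):
        x.append(_S * (a + d))
        x.append(_S * (a - d))
    return x


def multires(x, levels):
    """Repeatedly split the low band; returns [high_1, ..., high_L, low_L]."""
    bands = []
    low = list(x)
    for _ in range(levels):
        low, high = haar_analysis(low)
        bands.append(high)
    bands.append(low)
    return bands


def reconstruct(bands):
    """Rebuild the original sequence from the output of multires."""
    low = bands[-1]
    for high in reversed(bands[:-1]):
        low = haar_synthesis(low, high)
    return low


if __name__ == "__main__":
    signal = [0.0, 1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
    bands = multires(signal, levels=3)
    rebuilt = reconstruct(bands)
    err = max(abs(a - b) for a, b in zip(signal, rebuilt))
    print("subband lengths:", [len(b) for b in bands])
    print("max reconstruction error:", err)
```

Because each subband can be rebuilt without loss when left unquantised, any distortion in a coder built this way would come only from how the subbands are quantised, which is what allows a different, perception-based precision to be chosen for each band.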

Unless otherwise indicated, the views expressed in this thesis are those of the author and do not necessarily represent the views of the University of Wollongong.