Degree Name

Doctor of Philosophy


School of Electrical, Computer and Telecommunications Engineering


Machine learning has been extensively investigated over the last three decades for its capability to learn functional mappings from patterns. Nowadays, machine learning is routinely used in systems ranging from decision making to pattern recognition. However, with the emergence of ever-growing amounts of data, machine learning techniques such as neural networks and support vector machines become impractical or inefficient for large-scale problems. The main research of this thesis therefore focuses on analyzing sparse signal representation algorithms and their application to machine learning.

Three major problems of machine learning are addressed in this thesis: structural (model) optimization, supervised training and feature selection. A variety of sparsity-based algorithms are proposed to tackle these problems. The proposed algorithms are capable of representing salient information using only a few elements, thereby saving memory storage and enhancing the generalization ability of machine learning algorithms.

The first contribution is to optimize the structure of neural networks by finding a sparse representation for the network architecture. The proposed method starts with a very large network, and then a forward selection criterion is derived to identify important network parameters that minimize the residual output error. One advantage is that the algorithm requires no problem-dependent parameters, nor does it impose constraints on the network type.
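A forward selection criterion of this kind can be sketched as an orthogonal-matching-pursuit-style greedy search over hidden-unit activations: at each step, pick the unit most correlated with the current residual, then refit the output weights on the selected units. This is a minimal illustration, not the thesis's exact criterion; the function name and the least-squares refit are assumptions.

```python
import numpy as np

def forward_select_units(H, y, n_select):
    """Greedy forward selection (OMP-style sketch, hypothetical helper).

    H : (n_patterns, n_units) matrix of hidden-unit activations
    y : (n_patterns,) target outputs
    Picks the n_select columns of H that best reduce the residual
    output error, refitting output weights by least squares each step.
    """
    selected = []
    residual = y.copy()
    for _ in range(n_select):
        # Correlation of every unit's activation with the current residual.
        scores = np.abs(H.T @ residual)
        scores[selected] = -np.inf          # never re-pick a unit
        selected.append(int(np.argmax(scores)))
        # Least-squares refit of output weights on the selected units.
        Hs = H[:, selected]
        w, *_ = np.linalg.lstsq(Hs, y, rcond=None)
        residual = y - Hs @ w
    return selected, w
```

Because the weights are refit at every step, the residual is always orthogonal to the span of the already-selected units, which is what makes the greedy criterion effective.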

The second contribution is a sparsity-based network training algorithm that extends the structural optimization method. The proposed training algorithm consists of two sub-procedures: architecture optimization using sparse representation and weight update using dictionary learning. As a result, the algorithm is capable of training the network and optimizing the architecture simultaneously.
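The weight-update half of such a loop can be sketched with a method-of-optimal-directions (MOD) style dictionary update: holding the sparse codes fixed, the dictionary (weight matrix) that minimizes the Frobenius reconstruction error has a closed form. This is a generic dictionary-learning sketch under that assumption, not the thesis's specific update rule.

```python
import numpy as np

def mod_weight_update(Y, X):
    """MOD-style dictionary update (hypothetical helper).

    Given signals Y (d, n) and fixed sparse codes X (k, n), solve
        min_D ||Y - D X||_F^2   =>   D = Y X^T (X X^T)^{-1}
    then renormalize atoms so later selection scores stay comparable.
    """
    D = Y @ X.T @ np.linalg.pinv(X @ X.T)
    norms = np.linalg.norm(D, axis=0, keepdims=True)
    return D / np.maximum(norms, 1e-12)
```

Alternating this closed-form update with a sparse-coding (selection) step gives the simultaneous train-and-prune behavior described above.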

The sparsity-based training strategy is then applied to the least squares support vector machine (LS-SVM). The model of the LS-SVM is first reformulated as a sparse structure, and then the sparse representation is employed to reconstruct the learning model. The main advantage is that the proposed algorithm iteratively builds up a compact topology, while maintaining the training accuracy of the original large architecture.
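For context, the standard LS-SVM solves a single linear (KKT) system in which every training point receives a nonzero dual coefficient, so the model is inherently non-sparse; this is precisely what motivates a sparse reconstruction. A minimal sketch of the standard dual solve follows (function names are assumptions):

```python
import numpy as np

def rbf_kernel(X, Z, width=1.0):
    """Gaussian RBF kernel matrix between row-sets X and Z."""
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-width * d2)

def lssvm_fit(K, y, gamma=100.0):
    """Solve the LS-SVM KKT linear system for dual coefficients and bias:

        [ 0      1^T         ] [ b     ]   [ 0 ]
        [ 1   K + I / gamma  ] [ alpha ] = [ y ]
    """
    n = len(y)
    A = np.block([[np.zeros((1, 1)), np.ones((1, n))],
                  [np.ones((n, 1)), K + np.eye(n) / gamma]])
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[1:], sol[0]          # alpha, b
```

Since each alpha_i is proportional to the training error e_i, none vanish in general; a sparsity-based reconstruction instead retains only a compact subset of support vectors.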

The last contribution is to address the problem of dimensionality reduction via sparse signal representation. The presented algorithm regards the original features as basis functions, and selects discriminative features that minimize the residual error. The selected features have a direct correspondence to the performance requirement of the given problem. Experimentally, the proposed algorithm is tested on several benchmark classification tasks and a pedestrian detection problem. The results show that the sparsity-based algorithm achieves good classification accuracy and is also robust against variations in the number of training patterns.
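One way to realize features-as-basis-functions is to treat the feature columns as a dictionary and the class-indicator targets as the signal to reconstruct, greedily keeping the features that most reduce the residual. This is an illustrative sketch under those assumptions (the helper name and one-hot targets are not from the thesis); it also tacitly assumes the feature columns are on comparable scales.

```python
import numpy as np

def select_features(X, Y, n_feats):
    """Greedy feature selection by residual minimization (hypothetical).

    X : (n_samples, n_features) data matrix, columns act as basis functions
    Y : (n_samples, n_classes) one-hot class-indicator targets
    Returns the indices of the n_feats features whose span best
    reconstructs the class indicators.
    """
    selected = []
    R = Y.astype(float).copy()
    for _ in range(n_feats):
        # Score each feature by its total correlation with all class residuals.
        scores = np.linalg.norm(X.T @ R, axis=1)
        scores[selected] = -np.inf
        selected.append(int(np.argmax(scores)))
        # Refit the reconstruction on the selected features, update residual.
        Xs = X[:, selected]
        W, *_ = np.linalg.lstsq(Xs, Y, rcond=None)
        R = Y - Xs @ W
    return selected
```

Because the targets encode class membership, the residual criterion directly favors discriminative features rather than merely high-variance ones.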