Publication Details

This article originally appeared as: Tivive, FHC and Bouzerdoum, A, Efficient training algorithms for a class of shunting inhibitory convolutional neural networks, IEEE Transactions on Neural Networks, May 2005, 16(3), 541-556. Copyright IEEE 2005.


This article presents some efficient training algorithms, based on first-order, second-order, and conjugate gradient optimization methods, for a class of convolutional neural networks (CoNNs), known as shunting inhibitory convolution neural networks. Furthermore, a new hybrid method is proposed, which is derived from the principles of Quickprop, Rprop, SuperSAB, and least squares (LS). Experimental results show that the new hybrid method can perform as well as the Levenberg-Marquardt (LM) algorithm, but at a much lower computational cost and less memory storage. For comparison sake, the visual pattern recognition task of face/nonface discrimination is chosen as a classification problem to evaluate the performance of the training algorithms. Sixteen training algorithms are implemented for the three different variants of the proposed CoNN architecture: binary-, Toeplitz- and fully connected architectures. All implemented algorithms can train the three network architectures successfully, but their convergence speed vary markedly. In particular, the combination of LS with the new hybrid method and LS with the LM method achieve the best convergence rates in terms of number of training epochs. In addition, the classification accuracies of all three architectures are assessed using ten-fold cross validation. The results show that the binary- and Toeplitz-connected architectures outperform slightly the fully connected architecture: the lowest error rates across all training algorithms are 1.95% for Toeplitz-connected, 2.10% for the binary-connected, and 2.20% for the fully connected network. In general, the modified Broyden-Fletcher-Goldfarb-Shanno (BFGS) methods, the three variants of LM algorithm, and the new hybrid/LS method perform consistently well, achieving error rates of less than 3% averaged across all three architectures.



Link to publisher version (DOI)