A Study on the effects of recursive convolutional layers in convolutional neural networks
To overcome problems in the design of large networks, particularly with respect to network depth, this paper presents a new convolutional neural network (CNN) model that features fully recursive convolutional layers (RCLs). An RCL generalizes the classic one-stage feedforward convolutional layer (CL) by adding full, direct feedback connections from the outputs of the CL to its inputs. A traditional deep CNN consisting of many CLs can then be generalized so that some of its intermediate stages are RCLs while the others remain CLs. We call the resulting network a Convolutional Neural Network with Fully Recursive Perceptron Network (C-FRPN). An analysis of results from applying the C-FRPN to three benchmark image classification datasets (CIFAR-10, SVHN, and ISIC) shows that (i) in general, a C-FRPN, even with only one RCL, outperforms the corresponding deep CNN built entirely from CLs when both have the same number of unknown parameters; (ii) the performance of the C-FRPN varies with (a) where the RCLs are located and (b) the number of RCLs in the C-FRPN; and (iii) the effectiveness of the RCLs depends on the size of the training dataset. The results suggest that (a) RCLs are particularly advisable when training on very large datasets, (b) RCLs are best placed close to the input layer of the C-FRPN, and (c) the number of RCLs can be increased for as long as the training dataset can sustain it without overfitting being observed.
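To make the idea of an RCL concrete, the following is a minimal one-dimensional sketch, not the authors' implementation. The abstract does not give the update rule, so the form used here (a feedforward convolution plus a feedback convolution over the layer's own previous output, unrolled for a fixed number of steps) is an assumption, as are the names `conv1d_same`, `recursive_conv_layer`, `w_ff`, `w_fb`, and `steps`.

```python
def conv1d_same(signal, kernel):
    """Zero-padded 1D cross-correlation; assumes an odd-length kernel.

    The output has the same length as the input.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):  # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def recursive_conv_layer(inputs, w_ff, w_fb, steps):
    """Hypothetical RCL: a CL whose output is fed back into the layer.

    Assumed update: state <- relu(conv(inputs, w_ff) + conv(state, w_fb)).
    With steps=0 this reduces to an ordinary feedforward CL.
    """
    drive = conv1d_same(inputs, w_ff)  # feedforward term, computed once
    state = relu(drive)                # output of the plain CL
    for _ in range(steps):
        feedback = conv1d_same(state, w_fb)  # same feedback weights at every step
        state = relu([d + f for d, f in zip(drive, feedback)])
    return state


signal = [0.0, 1.0, 0.0, -1.0, 2.0]
w_ff = [0.5, 1.0, 0.5]  # illustrative feedforward kernel
w_fb = [0.1, 0.0, 0.1]  # illustrative feedback kernel

cl_out = recursive_conv_layer(signal, w_ff, w_fb, steps=0)
rcl_out = recursive_conv_layer(signal, w_ff, w_fb, steps=3)
```

In this sketch the feedback weights are reused at every step, so running more steps adds effective depth without adding parameters. That property is what makes a comparison under an equal parameter budget possible in principle, although how the paper itself matches parameter counts is not stated in the abstract.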
Open Access Status
This publication is not available as open access
Università degli Studi di Firenze