Posted on 2024-11-17, 13:48, authored by Johan Chagnon, Markus Hagenbuchner, Ah Chung Tsoi, Franco Scarselli
The development of Convolutional Neural Networks (CNNs) trends towards models with an ever-growing number of Convolutional Layers (CLs), which significantly increases the number of trainable parameters. Such models are sensitive to these structural parameters, which implies that large models have to be carefully tuned using hyperparameter optimisation, a process that can be very time consuming. In this paper, we study the use of Recursive Convolutional Layers (RCLs), a module relying on an algebraic feedback loop wrapped around a CL, which can replace any CL in CNNs. Using three publicly available datasets, CIFAR10, CIFAR100 and SVHN, and a simple model composed of four RCLs, we compare its performance with that of its feedforward counterpart, and exhibit some core properties and use cases of RCLs. In particular, we show that RCLs can lead to better-performing models, and that reducing the number of modules from four to one leads to a decrease in accuracy of 3.5% on average for models using RCLs, compared to 23% for models using CLs. Hence, the resulting architecture is much more robust to the addition or removal of layers. We conclude by relating the effects obtained using additional CLs to those obtained using additional recursion on RCLs, which suggests that the latter can simulate an increase in depth at no extra cost in parameters. Such results point to the potential benefits of replacing either selected CLs or all CLs with RCLs in most recently introduced CNNs.
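As a rough illustration of the idea of wrapping a feedback loop around a CL, the following is a minimal one-dimensional sketch in plain Python. It is not the authors' formulation: the update rule (re-applying the same kernel to the sum of the previous output and the original input), the ReLU activation, and the names `conv1d_same`, `recursive_conv_layer` and `stacked_conv_layers` are all assumptions made for illustration. It only aims to show that extra recursion reuses one set of weights, whereas extra stacked layers each add their own.

```python
def conv1d_same(signal, kernel):
    """Zero-padded 1D cross-correlation that preserves length (odd kernel size)."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(signal) + [0.0] * pad
    return [
        sum(k * padded[i + j] for j, k in enumerate(kernel))
        for i in range(len(signal))
    ]


def relu(values):
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in values]


def recursive_conv_layer(signal, kernel, iterations):
    """Hypothetical RCL: the layer output is fed back through the SAME kernel.

    Assumption: the feedback is combined with the original input by addition.
    With iterations == 0 this reduces to an ordinary convolutional layer.
    """
    state = relu(conv1d_same(signal, kernel))
    for _ in range(iterations):
        fed_back = [s + x for s, x in zip(state, signal)]
        state = relu(conv1d_same(fed_back, kernel))
    return state


def stacked_conv_layers(signal, kernels):
    """Feedforward counterpart: one distinct kernel per layer."""
    state = list(signal)
    for kernel in kernels:
        state = relu(conv1d_same(state, kernel))
    return state


kernel = [0.25, 0.5, 0.25]            # illustrative weights, not learned
signal = [0.0, 0.0, 1.0, 0.0, 0.0]    # a single impulse

rcl_out = recursive_conv_layer(signal, kernel, iterations=3)
stack_out = stacked_conv_layers(signal, [kernel] * 4)

rcl_params = len(kernel)        # one kernel, however many recursions are run
stack_params = 4 * len(kernel)  # four layers, four kernels
```

In this toy setting, raising `iterations` widens the region of the input that influences each output value, much as adding layers would, while `rcl_params` stays fixed. Whether this matches the behaviour reported in the paper depends on the actual RCL definition, which is not given on this page.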
Funding
Australian Research Council (DP210102674)
History
Journal title
Proceedings of the International Joint Conference on Neural Networks