Posted on 2025-09-24, 00:03, authored by Johan Chagnon
<p dir="ltr">A common approach to improving the performance of Convolutional Neural Networks (CNNs) is by adding Convolutional Layers (CLs). This increases their physical depth enabling CNNs to model a larger receptive field and to leverage from a hierarchical representation of the features. However, this approach has several disadvantages. Not only does it lead to an increased number of parameters, computational overhead and risk of overfitting, this phenomenon also diverges from our understanding of the humans’ visual system which relies on few cortical areas and includes feedback connections. In contrast, creating a Recursive Convolutional Layer (RCL) by adding a feedback loop over a CL offers a compelling alternative. By relying on weight sharing, these RCLs can unfold into an arbitrarily long sequence of CLs without additional parameters, effectively increasing the number of feature interactions, which we refer to as the apparent depth. As a consequence, earlier works showed that RCLs perform well in shallow architectures but face challenges in deeper networks, giving rise to the opinion that they are better when combined with a gating mechanism.</p><p dir="ltr">The current literature on RCLs lacks consistency in their experimental settings, making it challenging to consolidate and unify the existing findings. This inconsistency stems from a predominant focus on the optimisation of the performances which obscured the understanding of RCLs and the trade-offs that are involved. Our research addresses this limitation by focusing on a systematic investigation of the underlying mechanisms responsible for the behaviour of RCLs. Specifically, our study relies on non-gated RCLs with a particular emphasis on apparent depth with weight sharing, providing a more comprehensive and fair comparison of feedforward and recursive models. This is achieved through a novel approach to transform feedforward models into recursive alternatives: rather than replacing individual CLs, we substitute entire stages of CNNs consisting of repeated CLs blocks with a single non-gated RCL while preserving the apparent depth. Through a combination of empirical results and feature analysis, we investigate the behaviour of recursive models in order to achieve a better understanding and consequently to derive conditions under which RCLs can be beneficial when it is introduced in feedforward CNNs.</p><p dir="ltr">Our findings indicate that RCLs effectively simulate the behaviour of their physically much deeper feedforward counterpart. Specifically, increasing the unfolding depth expands the effective receptive field which is combined with a gradually more apparent reliance on centralised and global patterns to assert the prediction. This process mirrors the effect of sequences of CLs in the evolution of the feature space, progressively refining their representations despite using weight sharing. Thus, when the apparent depth and the number of parameters is matched, weight sharing capture more features per layer, which makes RCLs more flexible for diverse tasks. As a consequence, we find that in a large majority of circumstances, the recursive non-dense models maintain or improve the performances, particularly in settings with limited number of parameters over that of feedforward models. Despite these advantages, we identify several cases where recursion may degrade performances. 
Our findings indicate that RCLs effectively simulate the behaviour of their physically much deeper feedforward counterparts. Specifically, increasing the unfolding depth expands the effective receptive field, accompanied by an increasingly apparent reliance on centralised, global patterns to form the prediction. This process mirrors the effect of sequences of CLs on the evolution of the feature space, progressively refining the representations despite the use of weight sharing. Thus, when the apparent depth and the number of parameters are matched, weight sharing allows RCLs to capture more features per layer, making them more flexible across diverse tasks. As a consequence, we find that in a large majority of circumstances, recursive non-dense models maintain or improve performance over feedforward models, particularly in settings with a limited number of parameters. Despite these advantages, we identify several cases where recursion may degrade performance. While RCLs are particularly beneficial when used with lightweight convolutional blocks, they benefit less from dense connections and from multiscale feature extraction when the model is large. Furthermore, compared to their feedforward counterparts, they are more sensitive to overfitting, and excessive unfolding can degrade performance and unnecessarily increase computational overhead.

Building on these results, we develop an algorithm to automate the process of transforming a feedforward network into an efficient recursive variant. The algorithm exploits the strengths of RCLs, mitigates their limitations, and provides a foundation for future studies. By demonstrating the potential of recursive models to maintain performance with fewer parameters in controlled settings, this research opens new avenues for developing reliable and more efficient alternatives to feedforward CNNs, offering a promising solution for resource-constrained environments and numerous other applications.
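As a rough illustration of what such an automated transformation might look like, here is a hypothetical sketch that folds a stage of structurally identical blocks into a single shared block unfolded the same number of times. The structural check and every name (`UnfoldedBlock`, `make_recursive`) are assumptions made for illustration; the thesis' actual algorithm is not reproduced here.

```python
import copy
import torch.nn as nn

class UnfoldedBlock(nn.Module):
    """One block reused `steps` times, i.e. a non-gated recursive stage."""

    def __init__(self, block: nn.Module, steps: int):
        super().__init__()
        self.block = block
        self.steps = steps

    def forward(self, x):
        for _ in range(self.steps):
            x = self.block(x)
        return x

def make_recursive(stage: nn.Sequential) -> nn.Module:
    """Fold a stage of repeated, structurally identical blocks into a single
    shared block unfolded len(stage) times, preserving the apparent depth."""
    blocks = list(stage.children())
    # Crude structural check: identical module reprs. Note that only blocks
    # preserving the channel count can be unfolded; a stage whose first
    # block changes the width fails this check and is left untouched.
    same_structure = all(str(b) == str(blocks[0]) for b in blocks)
    if len(blocks) > 1 and same_structure:
        return UnfoldedBlock(copy.deepcopy(blocks[0]), steps=len(blocks))
    return stage  # heterogeneous stages are kept feedforward

# Example: a stage of three identical 64-to-64 blocks folds into one block
# unfolded three times (apparent depth preserved, parameters divided by 3).
block = lambda: nn.Sequential(nn.Conv2d(64, 64, 3, padding=1), nn.ReLU())
recursive_stage = make_recursive(nn.Sequential(block(), block(), block()))
```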
History
Year: 2025
Thesis type: Doctoral thesis
Faculty/School: School of Computing and Information Technology
Language: English
Disclaimer: Unless otherwise indicated, the views expressed in this thesis are those of the author and do not necessarily represent the views of the University of Wollongong.