Posted on 2024-11-12, 13:50. Authored by Joshua Luke Thompson.
In the past decade, deep learning (DL) has taken the world by storm. It has produced significant results in a wide variety of applications, ranging from self-driving cars to natural language processing (NLP). Modern deep learning is built from a number of components, including artificial neural networks (ANN), optimisation algorithms, back-propagation (BP), and varying levels of supervision. Recent advances in GPU hardware, improved availability of large, high-quality datasets, and the development of modern training algorithms have all played a pivotal role in its emergence. These advances have made it easier to train and deploy deeper neural networks that generalise well and achieve state-of-the-art (SOTA) results.

Scene understanding is a critical topic in computer vision. In recent years, semantic segmentation and monocular depth estimation have emerged as two key methods for achieving it. Combining these two tasks enables a system to determine both the features in an environment, through semantic segmentation, and the 3-D geometric information of those features, through depth estimation. This has many practical applications, including autonomous driving, robotics, assistive navigation, and virtual reality. Many of these applications require both tasks to be performed simultaneously; however, most methods use a separate model for each task, which is computationally expensive. Combining multiple tasks into a single model is computationally efficient and leverages the interrelations between tasks to generate reliable, accurate predictions. The use of a single model for two or more tasks is called multi-task learning (MTL). Despite recent advances in multi-task learning, most MTL models fall short of their single-task counterparts and often use computational resources inefficiently.
Year
2023
Thesis type
Doctoral thesis
Faculty/School
School of Electrical, Computer and Telecommunications Engineering
Language
English
Disclaimer
Unless otherwise indicated, the views expressed in this thesis are those of the author and do not necessarily represent the views of the University of Wollongong.