Year

2023

Degree Name

Doctor of Philosophy

Department

School of Electrical, Computer and Telecommunications Engineering

Abstract

In the past decade, deep learning (DL) has taken the world by storm. It has produced significant results in a wide variety of applications, ranging from self-driving cars to natural language processing (NLP). Modern deep learning is built from a number of components, including artificial neural networks (ANNs), optimisation algorithms, back-propagation (BP), and varying levels of supervision. Recent advances in GPU hardware, improved availability of large, high-quality datasets, and the development of modern training algorithms have all played a pivotal role in the emergence of modern deep learning. These advances have made it easier to train and deploy deeper neural networks that generalise well and achieve state-of-the-art (SOTA) results.

Scene understanding is a critical topic in computer vision. In recent years, semantic segmentation and monocular depth estimation have emerged as two key methods for achieving this goal. The combination of these two tasks enables a system to determine both the features in an environment, through semantic segmentation, and the 3-D geometric information of those features, through depth estimation. This has many practical applications, including autonomous driving, robotics, assistive navigation, and virtual reality. Many of these applications require both tasks to be performed simultaneously; however, most methods use a separate model for each task, which is computationally expensive. Combining multiple tasks into a single model is computationally efficient and leverages the interrelations between tasks to generate reliable, accurate predictions. The use of a single model for two or more tasks is called multi-task learning (MTL). Despite recent advances in multi-task learning, most MTL models fall short of their single-task counterparts and often use computational resources inefficiently.
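
As a rough illustration of why a single multi-task model can be cheaper than two single-task models, the sketch below counts the parameters of a small convolutional encoder shared by a segmentation head and a depth head, and compares that with two separate models that each carry their own encoder. The layer widths, the 19-class segmentation output, and the helper names are assumptions chosen for illustration only; they are not taken from the thesis.

```python
def conv_params(c_in, c_out, k=3):
    """Weights plus biases of one k x k convolution layer."""
    return c_in * c_out * k * k + c_out


def count(layers):
    """Total parameters of a stack of convolutions given (c_in, c_out) pairs."""
    return sum(conv_params(c_in, c_out) for c_in, c_out in layers)


# Hypothetical channel widths, chosen only for illustration.
ENCODER = [(3, 32), (32, 64), (64, 128)]
SEG_HEAD = [(128, 19)]   # assumed: 19 semantic classes
DEPTH_HEAD = [(128, 1)]  # one depth value per pixel

encoder = count(ENCODER)
seg_head = count(SEG_HEAD)
depth_head = count(DEPTH_HEAD)

# Two single-task models: each one has its own copy of the encoder.
separate = (encoder + seg_head) + (encoder + depth_head)

# One multi-task model: a shared encoder feeding both task heads.
shared = encoder + seg_head + depth_head

print(f"separate models: {separate} parameters")
print(f"shared encoder:  {shared} parameters")
print(f"saved by sharing: {separate - shared} parameters")
```

In this toy setting the saving is exactly one encoder's worth of parameters. Parameter count is only a proxy for compute and memory cost, and it says nothing about whether the shared model matches the accuracy of its single-task counterparts.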

Recommended Citation

Thompson, Joshua Luke, Efficient Deep Neural Networks for 3-D Scene Understanding of Unstructured Environments, Doctor of Philosophy thesis, School of Electrical, Computer and Telecommunications Engineering, University of Wollongong, 2023. https://ro.uow.edu.au/theses1/1717

FoR codes (2008)

0801 ARTIFICIAL INTELLIGENCE AND IMAGE PROCESSING, 0802 COMPUTATION THEORY AND MATHEMATICS, 0803 COMPUTER SOFTWARE

Unless otherwise indicated, the views expressed in this thesis are those of the author and do not necessarily represent the views of the University of Wollongong.

University of Wollongong Thesis Collection 2017+

Efficient Deep Neural Networks for 3-D Scene Understanding of Unstructured Environments
