University of Wollongong
Efficient Deep Neural Networks for 3-D Scene Understanding of Unstructured Environments

thesis
posted on 2024-11-12, 13:50 authored by Joshua Luke Thompson
In the past decade, deep learning (DL) has taken the world by storm. It has produced significant results in a wide variety of applications, ranging from self-driving cars to natural language processing (NLP). Modern deep learning is built from a number of different components, including artificial neural networks (ANNs), optimisation algorithms, back-propagation (BP), and varying levels of supervision. Recent advances in GPU hardware, improved availability of large, high-quality datasets, and the development of modern training algorithms have all played a pivotal role in the emergence of modern deep learning. These advances have made it easier to train and deploy deeper neural networks that generalise well and achieve state-of-the-art (SOTA) results.

Scene understanding is a critical topic in computer vision. In recent years, semantic segmentation and monocular depth estimation have emerged as two key methods for achieving this goal. The combination of these two tasks enables a system to determine both the features in an environment, through semantic segmentation, and the 3-D geometric information of those features, through depth estimation. This has many practical applications, including autonomous driving, robotics, assistive navigation, and virtual reality. Many of these applications require both tasks to be performed simultaneously; however, most methods use a separate model for each task, which is computationally expensive. Combining multiple tasks into a single model is computationally efficient and leverages the interrelations between tasks to generate reliable, accurate predictions. The use of a single model for two or more tasks is called multi-task learning (MTL). Despite recent advances in multi-task learning, most MTL models fall short of their single-task counterparts and often use computational resources inefficiently.

Year

2023

Thesis type

  • Doctoral thesis

Faculty/School

School of Electrical, Computer and Telecommunications Engineering

Language

English

Disclaimer

Unless otherwise indicated, the views expressed in this thesis are those of the author and do not necessarily represent the views of the University of Wollongong.
