Deep learning models such as the Convolutional Neural Network (CNN) and the Recurrent Neural Network (RNN) have enjoyed great success over the last decade. They have been widely applied to image and video tasks such as image recognition and action recognition. Despite this success, some fundamental problems remain unresolved. This thesis addresses generic challenges in CNNs and RNNs as well as challenges specific to their use in human activity recognition. First, an RNN-based pooling function is developed to replace handcrafted, predefined pooling functions, so that, together with the other layers of the CNN, all components of the network can be trained from data. Second, an independently recurrent neural network (IndRNN) is proposed to address the vanishing and exploding gradient problems of conventional RNNs. Unlike traditional RNNs, the IndRNN can learn very long-term patterns (over 5000 time steps) and can be stacked to construct very deep networks (over 21 layers). The application of IndRNN to activity recognition is then studied, and a deep IndRNN-based attention model is developed. Finally, applications in complex scenes are explored using camouflaged moving background modelling. Extensive experiments have been conducted to validate the methods proposed in this thesis.
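As a rough illustration of what "independently recurrent" means, the following sketch runs a single IndRNN-style layer over a sequence. The abstract does not state the update rule, so the sketch assumes the form h_t = ReLU(W x_t + u * h_{t-1} + b), in which the recurrent weight u is a vector applied element-wise rather than a full matrix. The function name `indrnn_layer` and all parameter names are hypothetical, and this is not the thesis implementation.

```python
import numpy as np


def indrnn_layer(x, W, u, b):
    """Run one IndRNN-style layer over a sequence (illustrative sketch).

    Assumed update rule: h_t = ReLU(W @ x_t + u * h_{t-1} + b).
    x: (T, input_dim) input sequence
    W: (hidden_dim, input_dim) input weights
    u: (hidden_dim,) recurrent weights, one scalar per neuron
    b: (hidden_dim,) bias
    Returns the hidden states for every time step, shape (T, hidden_dim).
    """
    T = x.shape[0]
    hidden_dim = u.shape[0]
    h = np.zeros(hidden_dim)
    outputs = np.empty((T, hidden_dim))
    for t in range(T):
        # Element-wise recurrence: neuron i only sees its own previous state,
        # so its gradient through time depends on the scalar u[i] alone.
        h = np.maximum(0.0, W @ x[t] + u * h + b)
        outputs[t] = h
    return outputs
```

Because each neuron carries its own scalar recurrent weight, the contribution of an input to that neuron's state is scaled by a power of that weight over time, which is what makes the gradient behaviour easy to bound per neuron. Stacking several such layers lets neurons interact across layers rather than within the recurrence.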
Year
2018
Thesis type
Doctoral thesis
Faculty/School
School of Computing and Information Technology
Language
English
Disclaimer
Unless otherwise indicated, the views expressed in this thesis are those of the author and do not necessarily represent the views of the University of Wollongong.