Faculty of Engineering and Information Sciences - Papers: Part B

Depth Pooling Based Large-Scale 3-D Action Recognition with Convolutional Neural Networks

Pichao Wang, University of Wollongong
Wanqing Li, University of Wollongong
Zhimin Gao, University of Wollongong
Chang Tang, Tianjin University, University of Wollongong
Philip O. Ogunbona, University of Wollongong

RIS ID

127096

Publication Details

Wang, P., Li, W., Gao, Z., Tang, C. & Ogunbona, P. (2018). Depth Pooling Based Large-Scale 3-D Action Recognition with Convolutional Neural Networks. IEEE Transactions on Multimedia, 20 (5), 1051-1061.

Abstract

© 1999-2012 IEEE. This paper proposes three simple, compact, yet effective representations of depth sequences, referred to respectively as dynamic depth images (DDI), dynamic depth normal images (DDNI), and dynamic depth motion normal images (DDMNI), for both isolated and continuous action recognition. These dynamic images are constructed from a segmented sequence of depth maps using hierarchical bidirectional rank pooling to effectively capture the spatial-temporal information. Specifically, DDI exploits the dynamics of postures over time, while DDNI and DDMNI exploit the 3-D structural information captured by depth maps. Based on the proposed representations, a convolutional neural network (ConvNet)-based method is developed for action recognition. The image-based representations make it possible to fine-tune existing ConvNet models trained on image data without training a large number of parameters from scratch. The proposed method achieved state-of-the-art results on three large datasets, namely, the large-scale continuous gesture recognition dataset (mean Jaccard index of 0.4109), the large-scale isolated gesture recognition dataset (59.21%), and the NTU RGB+D dataset (87.08% cross-subject and 84.22% cross-view), even though only the depth modality was used.
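
The continuous gesture result above is reported as a mean Jaccard index. As a rough illustration of that metric only, and not the dataset's official evaluation script, the sketch below scores one predicted gesture segment against its ground-truth segment as the overlap of their frame sets, then averages over several segments. The function names `jaccard_index` and `mean_jaccard` and the frame ranges are hypothetical, chosen for this example.

```python
def jaccard_index(truth_frames, predicted_frames):
    """Intersection over union of two collections of frame indices."""
    truth, predicted = set(truth_frames), set(predicted_frames)
    union = truth | predicted
    if not union:
        # Assumption: two empty segments score 0 rather than raising an error.
        return 0.0
    return len(truth & predicted) / len(union)


def mean_jaccard(pairs):
    """Average the Jaccard index over (truth, predicted) segment pairs."""
    scores = [jaccard_index(truth, predicted) for truth, predicted in pairs]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical example: a gesture labelled on frames 10-19, predicted on 15-24.
truth = range(10, 20)
predicted = range(15, 25)
score = jaccard_index(truth, predicted)  # 5 shared frames out of 15 in the union

# One perfect match and one complete miss average out to one half.
average = mean_jaccard([(range(0, 4), range(0, 4)), (range(0, 4), range(4, 8))])
```

A score of 1 means the predicted frames coincide exactly with the labelled frames, and 0 means they do not overlap at all, so the metric penalises both missed frames and frames wrongly assigned to a gesture.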

Included in

Engineering Commons, Science and Technology Studies Commons

Link to publisher version (DOI)

http://dx.doi.org/10.1109/TMM.2018.2818329
