Posted on 2024-11-14, 11:20, authored by Pichao Wang, Zhaoyang Li, Yonghong Hou, Wanqing Li
Recently, Convolutional Neural Networks (ConvNets) have shown promising performance in many computer vision tasks, especially image-based recognition. How to effectively use ConvNets for video-based recognition is still an open problem. In this paper, we propose a compact, effective, yet simple method to encode the spatio-temporal information carried in 3D skeleton sequences into multiple 2D images, referred to as Joint Trajectory Maps (JTM), and ConvNets are adopted to exploit discriminative features from these maps for real-time human action recognition. The proposed method has been evaluated on three public benchmarks, i.e., the MSRC-12 Kinect gesture dataset (MSRC-12), the G3D dataset, and the UTD multimodal human action dataset (UTD-MHAD), and achieved state-of-the-art results.
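To make the core idea concrete, the sketch below is a minimal, illustrative rendition of a Joint Trajectory Map, not the authors' exact encoding: it orthogonally projects a skeleton sequence onto one plane and draws each joint's trajectory with a hue that sweeps over time, so temporal order becomes color in the image a ConvNet consumes. The function name, image size, single-view projection, and time-to-hue mapping are assumptions for illustration; the paper uses a richer encoding over multiple projection planes.

```python
import colorsys

import numpy as np
from PIL import Image, ImageDraw


def joint_trajectory_map(skeleton, size=256):
    """Illustrative JTM sketch (assumed simplification of the paper's method).

    skeleton: (T, J, 3) array of 3D joint coordinates over T frames.
    Projects onto the front (XY) plane and draws per-joint trajectories
    as line segments whose hue encodes the frame index.
    """
    T, J, _ = skeleton.shape
    xy = skeleton[..., :2]  # orthogonal projection onto the front plane

    # Normalize joint coordinates into image space with a small margin.
    mins = xy.reshape(-1, 2).min(axis=0)
    maxs = xy.reshape(-1, 2).max(axis=0)
    scale = (size - 20) / max((maxs - mins).max(), 1e-8)
    pts = ((xy - mins) * scale + 10).astype(int)
    pts[..., 1] = size - 1 - pts[..., 1]  # flip Y so "up" points up

    img = Image.new("RGB", (size, size))
    draw = ImageDraw.Draw(img)
    for t in range(T - 1):
        # Hue sweeps from red to red across the sequence, encoding time.
        r, g, b = colorsys.hsv_to_rgb(t / max(T - 1, 1), 1.0, 1.0)
        color = (int(r * 255), int(g * 255), int(b * 255))
        for j in range(J):
            p0 = tuple(map(int, pts[t, j]))
            p1 = tuple(map(int, pts[t + 1, j]))
            draw.line([p0, p1], fill=color, width=2)
    return img


if __name__ == "__main__":
    # Toy usage: a random-walk "skeleton" with 40 frames and 20 joints.
    rng = np.random.default_rng(0)
    seq = np.cumsum(rng.normal(scale=0.02, size=(40, 20, 3)), axis=0)
    joint_trajectory_map(seq).save("jtm_front.png")
```

The resulting RGB image can be fed to any standard image-classification ConvNet; in this simplified view, the network learns action classes from the shape and color gradient of the drawn trajectories.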
Citation
Wang, P., Li, Z., Hou, Y. & Li, W. (2016). Action recognition based on joint trajectory maps using convolutional neural networks. Proceedings of the 2016 ACM on Multimedia Conference (pp. 102-106). New York, United States: ACM.
Parent title
MM 2016 - Proceedings of the 2016 ACM Multimedia Conference