Static graph convolution with learned temporal and channel-wise graph topology generation for skeleton-based action recognition

Publication Name

Computer Vision and Image Understanding

Abstract

Graph convolutional networks (GCNs) are widely used in skeleton-based action recognition. The graph topology is a vital part of GCNs, and different kinds of graph topologies have been proposed for skeleton-based action recognition, mostly based on a predefined topology and a dynamically learned one. The predefined topology is based on human intuition about the skeleton (the connectivity of joints), and whether it is optimal has not been investigated. In this paper, we investigate this static graph topology and propose to generate a learned static graph topology for the skeleton. Specifically, temporal frame-wise and channel-wise topology-based GCNs (TC-GCNs) are developed, in which a topology is learned for skeleton-based action recognition instead of being predefined by humans. The TC-GCNs generate a temporal frame-wise topology and a channel-wise topology to model the relationships among skeleton joints in the temporal dimension and the channel dimension, respectively. The proposed method can be integrated with the conventional dynamic topology by replacing the predefined graph topology with our generated one. Experimental results show that our method with the learned static graph achieves better performance than the predefined graph and the dynamic graph on three widely used benchmarks, namely NTU-RGB+D, NTU-RGB+D 120 and UAV-Human.
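As a rough illustration of the idea in the abstract, the sketch below aggregates joint features with one learned adjacency per frame and one per channel. It is not the authors' implementation: the function name `tc_graph_aggregate`, the tensor shapes, the random stand-ins for learned parameters, and in particular the choice to sum the two topologies are all assumptions, since the abstract does not state how the topologies are generated or combined.

```python
import numpy as np

def tc_graph_aggregate(x, a_temporal, a_channel):
    """Aggregate joint features with frame-wise and channel-wise topologies.

    x:          (C, T, V) features for C channels, T frames, V joints
    a_temporal: (T, V, V) one static adjacency per frame (hypothetical parameter)
    a_channel:  (C, V, V) one static adjacency per channel (hypothetical parameter)
    """
    # Assumption: the two topologies are combined by broadcasting and summing,
    # giving one V x V adjacency for every (channel, frame) pair.
    combined = a_temporal[None, :, :, :] + a_channel[:, None, :, :]  # (C, T, V, V)
    # out[c, t, v] = sum over u of x[c, t, u] * combined[c, t, u, v]
    return np.einsum("ctu,ctuv->ctv", x, combined)

rng = np.random.default_rng(0)
C, T, V = 4, 8, 25  # 25 joints, as in the NTU-RGB+D skeleton
x = rng.standard_normal((C, T, V))
# Random values stand in for parameters that would be learned during training.
a_temporal = rng.standard_normal((T, V, V)) * 0.01
a_channel = rng.standard_normal((C, V, V)) * 0.01
out = tc_graph_aggregate(x, a_temporal, a_channel)
```

In a trained model both adjacency tensors would be optimized end to end with the rest of the network; they are fixed after training, which is what makes the topology static rather than input-dependent.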

Open Access Status

This publication is not available as open access

Volume

244

Article Number

104012

Funding Number

62001092

Funding Sponsor

National Natural Science Foundation of China

Link to publisher version (DOI)

http://dx.doi.org/10.1016/j.cviu.2024.104012