Improved Shift Graph Convolutional Network for Action Recognition With Skeleton

Publication Name

IEEE Signal Processing Letters


Shift graph convolutional network (Shift-GCN) achieves remarkable performance for skeleton-based action recognition with lower computational complexity than other GCN-based methods. However, the current Shift-GCN, with a single spatial shift, a static mask, and a local temporal convolution, cannot fully exploit the spatial-temporal features among skeleton joints across different frames. To address these problems, an improved shift graph convolutional network (Ishift-GCN) is proposed in this letter. The Ishift-GCN consists of two parts: a bidirectional spatial shift graph convolution with a dynamic mask, and a multi-scale temporal shift graph convolution. The bidirectional spatial shift graph convolution exploits more spatial information among joints, and the dynamic mask, which has stronger generalization ability, can learn different correlations among the features of different joints for different actions. The multi-scale temporal shift graph convolution captures more temporal information by complementing the shifted features with multi-scale convolution. Furthermore, knowledge distillation is used to reduce computational complexity. Compared with Shift-GCN, the proposed Ishift-GCN achieves better results with lower computational complexity on two widely used benchmarks, namely the NTU-RGB+D and UAV-Human datasets.
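
To make the notion of a spatial shift over joints concrete, the following is a minimal pure-Python sketch; it is not the letter's implementation. It assumes a non-local shift pattern in which channel c of joint i is read from joint (i + c) mod V, and it models "bidirectional" simply as concatenating a forward and a backward shift. The function names, the concatenation, and the list-of-lists layout (V joints by C channels) are illustrative assumptions, and the dynamic mask, the temporal branch, and knowledge distillation are deliberately left out.

```python
def spatial_shift(x, direction=1):
    """Shift channel c of every joint by direction * c joints (mod V).

    x is a list of V joints, each a list of C channel values.
    Assumed shift rule (not taken from the letter):
        out[i][c] = x[(i + direction * c) % V][c]
    """
    num_joints = len(x)
    num_channels = len(x[0])
    return [
        [x[(i + direction * c) % num_joints][c] for c in range(num_channels)]
        for i in range(num_joints)
    ]


def bidirectional_shift(x):
    """Concatenate forward- and backward-shifted features per joint.

    Assumption: the two directions are combined by channel concatenation,
    which doubles the channel count from C to 2C.
    """
    forward = spatial_shift(x, direction=1)
    backward = spatial_shift(x, direction=-1)
    return [f + b for f, b in zip(forward, backward)]
```

Under these assumptions, each output joint mixes channels drawn from different source joints without any multiplications, which is why a shift step is cheap; in a full layer it would typically be followed by a pointwise (1x1) convolution, which this sketch omits.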

Open Access Status

This publication is not available as open access


