Scopus Harvesting Series

DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition

Thanh Dat Truong, University of Arkansas
Quoc Huy Bui, FPT Software
Chi Nhan Duong, Concordia University
Han Seok Seo, University of Arkansas
Son Lam Phung, University of Wollongong
Xin Li, West Virginia University
Khoa Luu, University of Arkansas

Publication Name

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

Abstract

Human action recognition has recently become one of the popular research topics in the computer vision community. Various 3D-CNN based methods have been presented to tackle both the spatial and temporal dimensions in the task of video action recognition with competitive results. However, these methods have suffered some fundamental limitations such as lack of robustness and generalization, e.g., how does the temporal ordering of video frames affect the recognition results? This work presents a novel end-to-end Transformer-based Directed Attention (DirecFormer) framework for robust action recognition (the implementation of DirecFormer is available at https://github.com/uark-cviu/DirecFormer). The method takes a simple but novel perspective of Transformer-based approach to understand the right order of sequence actions. Therefore, the contributions of this work are three-fold. Firstly, we introduce the problem of ordered temporal learning issues to the action recognition problem. Secondly, a new Directed Attention mechanism is introduced to understand and provide attentions to human actions in the right order. Thirdly, we introduce the conditional dependency in action sequence modeling that includes orders and classes. The proposed approach consistently achieves the state-of-the-art (SOTA) results compared with the recent action recognition methods [4, 18, 72, 74] on three standard large-scale benchmarks, i.e., Jester, Kinetics-400 and Something-Something-V2.

Open Access Status

This publication may be available as open access

Volume

2022-June

First Page

19998

Last Page

20008

Funding Sponsor

National Science Foundation

Link to publisher version (DOI)

http://dx.doi.org/10.1109/CVPR52688.2022.01940
