Faculty of Engineering and Information Sciences - Papers: Part B

Jointly Learning Visual Poses and Pose Lexicon for Semantic Action Recognition

Lijuan Zhou, University of Wollongong, Zhengzhou University
Wanqing Li, University of Wollongong
Philip O. Ogunbona, University of Wollongong
Zhengyou Zhang, Microsoft

RIS ID

132622

Publication Details

Zhou, L., Li, W., Ogunbona, P. & Zhang, Z. (2019). Jointly Learning Visual Poses and Pose Lexicon for Semantic Action Recognition. IEEE Transactions on Circuits and Systems for Video Technology, Online First 1-11.

Abstract

A novel method for semantic action recognition through learning a pose lexicon is presented in this paper. A pose lexicon comprises a set of semantic poses, a set of visual poses and a probabilistic mapping between visual and semantic poses. This paper assumes that both visual poses and mapping are hidden and proposes a method to simultaneously learn a visual pose model that estimates the likelihood of an observed video frame being generated from hidden visual poses, and a pose lexicon model that establishes the probabilistic mapping between the hidden visual poses and the semantic poses parsed from textual instructions. Specifically, the proposed method consists of two-level hidden Markov models. One level represents the alignment between the visual poses and semantic poses. The other level represents a visual pose sequence and each visual pose is modelled as a Gaussian mixture. An expectation-maximization algorithm is developed to train a pose lexicon. With the learned lexicon, action classification is formulated as a problem of finding the maximum posterior probability of a given sequence of video frames that follows a given sequence of semantic poses, constrained by the most likely visual pose and the alignment sequences. The proposed method was evaluated on MSRC-12, WorkoutSU-10, WorkoutUOW-18, Combined-15, Combined-17 and Combined-50 action datasets using cross-subject, cross-dataset, zero-shot and seen/unseen protocols.
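
The classification step described in the abstract can be illustrated with a much simplified sketch. It is not the authors' implementation: it assumes the per-frame log-likelihoods under each visual pose are already available (for example from fitted Gaussian mixtures), it ignores transition probabilities at both levels, and it only allows each frame to stay on the current semantic pose or advance to the next one. The names `action_score`, `classify`, `frame_loglik` and `log_lexicon` are hypothetical and chosen for illustration only.

```python
import numpy as np

def action_score(frame_loglik, log_lexicon, semantic_seq):
    """Best-path log score of frames aligned in order to a semantic pose sequence.

    frame_loglik: (T, K) array, log p(frame t | visual pose k), assumed given.
    log_lexicon:  (S, K) array, log p(visual pose k | semantic pose s).
    semantic_seq: list of semantic pose indices describing one action.
    Simplifying assumption: no transition probabilities are modelled.
    """
    n_frames = frame_loglik.shape[0]
    n_steps = len(semantic_seq)
    # emit[t, j]: score of the most likely visual pose for frame t
    # when the frame is aligned to the j-th semantic pose of the action.
    emit = (frame_loglik[:, None, :] + log_lexicon[semantic_seq][None, :, :]).max(axis=2)
    # Viterbi-style recursion over a monotonic alignment (stay or advance by one).
    score = np.full(n_steps, -np.inf)
    score[0] = emit[0, 0]
    for t in range(1, n_frames):
        advance = np.concatenate(([-np.inf], score[:-1]))
        score = np.maximum(score, advance) + emit[t]
    # The alignment must end on the last semantic pose.
    return score[-1]

def classify(frame_loglik, log_lexicon, actions):
    """Return the action whose semantic pose sequence scores highest, plus all scores."""
    scores = {name: action_score(frame_loglik, log_lexicon, seq)
              for name, seq in actions.items()}
    return max(scores, key=scores.get), scores

# Toy example with two visual poses and two semantic poses (made-up numbers).
log_lexicon = np.log(np.array([[0.9, 0.1], [0.1, 0.9]]))
frame_loglik = np.log(np.array([[0.9, 0.1], [0.9, 0.1], [0.1, 0.9], [0.1, 0.9]]))
actions = {"raise": [0, 1], "lower": [1, 0]}
label, scores = classify(frame_loglik, log_lexicon, actions)
```

In this toy setting the frames move from the first visual pose to the second, so the action whose semantic poses appear in the same order receives the higher score. Because an action is represented only by its semantic pose sequence, a new action can be scored without new video examples, which is the idea behind the zero-shot protocol mentioned above.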

Please refer to publisher version or contact your library.

Link to publisher version (DOI)

http://dx.doi.org/10.1109/TCSVT.2019.2890829
