21 August 2018 Multisource learning for skeleton-based action recognition using deep LSTM and CNN
Ran Cui, Aichun Zhu, Gang Hua, Hongsheng Yin, Haiqiang Liu
Abstract
Human action recognition based on skeletal information has been widely studied because such information expresses action-related features simply and clearly and is unaffected by physical features of the body. This paper therefore addresses action recognition based on three-dimensional skeletal information obtained from RGB-D videos. We propose a multisource action recognition model that combines features of the temporal and spatial domains. Because different actions involve different parts of the body, our model considers action features at three levels: global, local, and detail. For temporal features, we adopt long short-term memory (LSTM) to build a model that analyzes skeleton sequences. For features of the spatial domain, we analyze the effects of three features on action recognition: skeleton joint coordinates, pairwise relative positions, and speed of movement. Finally, the temporal and spatial domain models are combined into a multisource model to improve the accuracy of action recognition. Experiments show that our model considerably improves the recognition of a variety of general actions and interactive activities.
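As an illustration only, the three spatial-domain features named in the abstract could be derived from a skeleton sequence roughly as follows. This is a minimal sketch, not the authors' implementation: the abstract does not give exact definitions, so the array layout `(T, J, 3)`, the use of all unordered joint pairs, the frame-to-frame difference as a stand-in for speed, and the function name `spatial_features` are all assumptions.

```python
import numpy as np

def spatial_features(seq):
    """Compute three illustrative spatial features from a skeleton sequence.

    seq: array-like of shape (T, J, 3), i.e. T frames, J joints, 3D coordinates.
    (This layout is an assumption, not taken from the paper.)
    """
    seq = np.asarray(seq, dtype=float)
    num_joints = seq.shape[1]

    # 1. Skeleton joint coordinates: the raw 3D positions, shape (T, J, 3).
    coords = seq

    # 2. Pairwise relative positions: difference between every unordered
    #    joint pair (a, b) with a < b, shape (T, J*(J-1)/2, 3).
    a, b = np.triu_indices(num_joints, k=1)
    relative = seq[:, a, :] - seq[:, b, :]

    # 3. Speed of movement: displacement of each joint between consecutive
    #    frames, shape (T-1, J, 3). Dividing by the frame interval would
    #    convert this to physical velocity.
    speed = np.diff(seq, axis=0)

    return coords, relative, speed


# Hypothetical input: 4 frames, 5 joints, random 3D coordinates.
rng = np.random.default_rng(0)
example = rng.normal(size=(4, 5, 3))
coords, relative, speed = spatial_features(example)
print(coords.shape, relative.shape, speed.shape)
```

Each of the three arrays could then be fed to a separate spatial model; how the paper actually encodes and fuses them is not stated in the abstract.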
© 2018 SPIE and IS&T 1017-9909/2018/$25.00
Ran Cui, Aichun Zhu, Gang Hua, Hongsheng Yin, and Haiqiang Liu "Multisource learning for skeleton-based action recognition using deep LSTM and CNN," Journal of Electronic Imaging 27(4), 043050 (21 August 2018). https://doi.org/10.1117/1.JEI.27.4.043050
Received: 20 May 2018; Accepted: 24 July 2018; Published: 21 August 2018
CITATIONS
Cited by 7 scholarly publications.
KEYWORDS
Video

Data modeling

RGB color model

Remote sensing

Feature extraction

Neurons

Performance modeling
