Humans face the risk of injury in the course of their daily activities, and such injuries impose a heavy economic burden on individuals and society at large. One of the main causes of injury is poor posture, in which body parts are held in awkward or extreme positions that increase stress on nearby joints and muscles. Postural assessment techniques based on biomechanical principles hold promise for preventing poor posture through early identification of posture risks.

This thesis focuses on postural assessment from images, aiming at a fully automatic, low-cost, convenient, and reliable method for use in daily life. Specifically, a novel one-stage framework is proposed in which input images are processed directly into features that form the basis of further analysis. Within this framework, poses are not explicitly measured or estimated as in existing methods, which avoids the error accumulation and information loss that arise across multiple stages. Based on the proposed framework, four methods are developed for real-time postural assessment by adapting techniques from computer vision and machine learning, in particular deep learning.

The first method targets real-time upper-body postural assessment and combines histograms of oriented gradients (HOG) for feature extraction with a support vector machine (SVM) for inference. In addition, HOG is extended to a weighted HOG (WHOG) by increasing the weights assigned to features from key body parts. Two types of key parts, ergonomics-based and visual-based, are explored, and experimental results show that key parts need not be located at ergonomically defined body parts for an accurate assessment. In the second method, exploiting advances in deep learning, an end-to-end trainable convolutional neural network (CNN) is proposed for real-time upper-body postural assessment. The network is trained to extract features from key parts that are learned from data rather than explicitly selected or detected. In addition, the network can learn posture-related geometric information by enforcing pose similarity in the learned features through a triplet rank loss.

The third method targets automatic whole-body postural assessment. To further improve the reliability of the extracted features and reduce the training complexity of the second method, an attention module adapted from the spatial transformer (ST) is proposed. Multiple regions of the image are identified for feature extraction, and the attention module is trained using only the risk labels. In addition, the geometric relationships among body parts are explicitly encoded as features for analysis. The fourth method studies postural assessment from a short sequence of 2D skeletons. Graph convolutional networks (GCNs) are extended to extract spatio-temporal features from joint and bone streams constructed from the skeletons, with layers shared between the two streams. This sharing mechanism leads to a significant improvement in assessment accuracy compared with the architecture without sharing.

In addition, two large-scale image-based datasets are created for the study of postural assessment, one for upper-body assessment and the other for whole-body assessment. Extensive evaluations of the four methods on the corresponding datasets verify their effectiveness.
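To make the first method concrete, the sketch below shows a minimal HOG-plus-SVM pipeline with a simple weighted-HOG variant that up-weights blocks overlapping assumed key-part regions. It is an illustrative reconstruction under stated assumptions, not the thesis implementation; the function names, the weighting scheme, and the `key_mask` input are introduced here only for the example.

```python
# Minimal sketch (assumed, not the thesis code): weighted HOG features and an
# SVM classifier for image-based posture risk assessment.
import numpy as np
from skimage.feature import hog
from sklearn.svm import SVC

def weighted_hog(gray_image, key_mask, key_weight=2.0,
                 pixels_per_cell=(16, 16), cells_per_block=(2, 2)):
    """Compute HOG features and scale blocks that overlap `key_mask`,
    a boolean image marking assumed key body parts."""
    feats = hog(gray_image, orientations=9,
                pixels_per_cell=pixels_per_cell,
                cells_per_block=cells_per_block,
                feature_vector=False)
    # feats shape: (block_rows, block_cols, cells_r, cells_c, orientations)
    cell_h, cell_w = pixels_per_cell
    block_h, block_w = cells_per_block[0] * cell_h, cells_per_block[1] * cell_w
    for r in range(feats.shape[0]):
        for c in range(feats.shape[1]):
            y, x = r * cell_h, c * cell_w
            if key_mask[y:y + block_h, x:x + block_w].any():
                feats[r, c] *= key_weight  # emphasise key-part blocks
    return feats.ravel()

# Hypothetical training step over a labelled posture dataset:
# X = np.stack([weighted_hog(img, mask) for img, mask in samples])
# clf = SVC(kernel="rbf").fit(X, risk_labels)
```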
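For the second method, the following is a hedged sketch of how a CNN classifier might be trained jointly with a triplet term that enforces pose similarity in the learned features. The backbone choice, the loss weighting `alpha`, and the two-class output are assumptions for illustration, not details taken from the thesis.

```python
# Sketch (assumed setup): a CNN posture classifier trained with an auxiliary
# triplet term so that images with similar poses map to nearby features.
import torch
import torch.nn as nn
from torchvision import models

backbone = models.resnet18(weights=None)
backbone.fc = nn.Identity()            # 512-d feature extractor
classifier = nn.Linear(512, 2)         # e.g. low-risk vs. high-risk posture

ce_loss = nn.CrossEntropyLoss()
triplet_loss = nn.TripletMarginLoss(margin=1.0)

def training_loss(anchor, positive, negative, label, alpha=0.5):
    """`anchor` and `positive` share a similar pose, `negative` does not;
    `alpha` balances classification against pose similarity."""
    f_a = backbone(anchor)
    f_p = backbone(positive)
    f_n = backbone(negative)
    return ce_loss(classifier(f_a), label) + alpha * triplet_loss(f_a, f_p, f_n)
```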
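The third method's attention module is adapted from the spatial transformer; the snippet below is a generic spatial-transformer crop in PyTorch, included only to illustrate the mechanism (a localisation network predicts an affine transform, followed by differentiable sampling of the attended region). The layer sizes and the single-region output are assumptions and do not reflect the thesis architecture.

```python
# Generic spatial-transformer-style attention crop (illustrative assumption):
# a small localisation net predicts an affine transform and grid_sample
# extracts the attended region for downstream feature extraction.
import torch
import torch.nn as nn
import torch.nn.functional as F

class STAttention(nn.Module):
    def __init__(self):
        super().__init__()
        self.loc = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=7), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(8 * 4 * 4, 6))
        # start from the identity transform
        self.loc[-1].weight.data.zero_()
        self.loc[-1].bias.data.copy_(
            torch.tensor([1, 0, 0, 0, 1, 0], dtype=torch.float))

    def forward(self, x):                      # x: (N, 3, H, W)
        theta = self.loc(x).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False)
```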
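Finally, for the fourth method, the sketch below shows one way a graph-convolution layer could be shared between a joint stream and a bone stream built from 2D skeleton sequences. The tensor layout, adjacency normalisation, and 1x1 projection are assumptions made for illustration rather than the thesis design.

```python
# Sketch (assumed layout): a graph-convolution layer whose weights are shared
# between the joint stream and the bone stream of a skeleton sequence.
import torch
import torch.nn as nn

class SharedGraphConv(nn.Module):
    def __init__(self, in_channels, out_channels, adjacency):
        super().__init__()
        self.register_buffer("A", adjacency)        # (V, V) normalised adjacency
        self.proj = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, joints, bones):
        # joints, bones: (N, C, T, V) = batch, channels, frames, skeleton joints
        out_j = torch.einsum("nctv,vw->nctw", self.proj(joints), self.A)
        out_b = torch.einsum("nctv,vw->nctw", self.proj(bones), self.A)
        return out_j, out_b                         # one set of weights, two streams
```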
History
Year: 2021
Thesis type: Doctoral thesis
Faculty/School: School of Computing and Information Technology
Language: English
Disclaimer
Unless otherwise indicated, the views expressed in this thesis are those of the author and do not necessarily represent the views of the University of Wollongong.