Degree Name

Doctor of Philosophy


School of Electrical, Computer and Telecommunications Engineering


Hand gesture recognition has been applied in many fields in recent years, especially in man-machine interaction (MMI), where it is regarded as a more natural and flexible form of input than traditional devices such as the mouse and keyboard. The Microsoft Kinect camera has also drastically changed computer-vision-based human-computer interaction, owing to its low cost and the high quality of the depth information it provides alongside colour images. Depth data has thereby become commonplace at very low cost, enabling a myriad of computer vision applications, including hand gesture recognition. Hand gesture recognition research has long suffered from background clutter and skin-toned regions in the background. With depth information available, background clutter and skin-toned regions that are not part of the hand gesture can be removed, which improves the performance of any classification strategy. In this thesis, an overview of hand gesture recognition research to date is presented, covering the common stages of hand gesture recognition, the common methods and techniques used at each stage, the state of recent research, and summaries of some successful hand gesture recognition models. This thesis also presents a novel hand detection strategy for the Kinect camera that combines depth and colour image information. In the detection procedure, a Kalman filter is applied in the tracking process to achieve a good detection result. The experimental results in Chapter 3 show that this detection method is reliable and stable against cluttered backgrounds and works well under various lighting conditions.
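
The idea of using depth to discard skin-toned background regions can be illustrated with a minimal sketch. It is not the detection method of this thesis: the RGB skin rule, the 150 mm depth band, the assumption that the hand is the object nearest to the camera, and the names `is_skin` and `hand_mask` are all illustrative assumptions. Images are plain nested lists so that the example needs no external library.

```python
def is_skin(r, g, b):
    """Simple RGB skin heuristic (an assumption, not the thesis's colour model)."""
    return (r > 95 and g > 40 and b > 20
            and max(r, g, b) - min(r, g, b) > 15
            and abs(r - g) > 15 and r > g and r > b)


def hand_mask(depth_mm, rgb, band_mm=150):
    """Keep pixels that are skin-coloured AND lie within band_mm of the nearest depth.

    depth_mm: rows of depth values in millimetres, 0 meaning "no reading".
    rgb:      rows of (r, g, b) tuples aligned with depth_mm.
    Assumes the hand is the closest object to the camera.
    """
    valid = [d for row in depth_mm for d in row if d > 0]
    if not valid:
        return [[False] * len(row) for row in depth_mm]
    nearest = min(valid)
    return [
        [0 < d <= nearest + band_mm and is_skin(*rgb[y][x])
         for x, d in enumerate(row)]
        for y, row in enumerate(depth_mm)
    ]


# Toy 2x3 frame: a hand at about 0.6 m, a skin-toned object on a wall at about 2 m.
skin, wall = (200, 140, 120), (90, 90, 90)
depth = [[600, 650, 2000],
         [0,   620, 2100]]
colour = [[skin, wall, skin],
          [skin, skin, wall]]
mask = hand_mask(depth, colour)
```

In this toy frame the skin-toned pixel at 2000 mm is rejected by the depth test and the skin-toned pixel with no depth reading is rejected as invalid, so only the two near, skin-coloured pixels remain in the mask.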

Gesture recognition is an important and challenging task in the field of computer vision. Starting from the encoding of the 3D shape of gestures, this thesis puts forward a new gesture recognition framework based on depth images. It extracts a variety of spatial features from the 3D point cloud captured by the Kinect: a histogram of principal components obtained by local principal component analysis of the point cloud, a gradient direction histogram based on local depth differences, and a depth distribution histogram of the local point cloud. The principal component histogram and the gradient direction histogram effectively encode the local shape of gestures, while the depth distribution histogram compensates for the information lost by the shape descriptors. A random forest classifier is first trained to screen the features, and features with little influence on the classification results are removed, which reduces the computational cost. The retained features are then used to train a random forest classifier again to classify gestures. Experiments are carried out on two large-scale gesture datasets; on the more difficult ASL dataset, the proposed method improves the recognition rate by 3.6% over the previous algorithm. This thesis shows good prospects for hand gesture recognition, based on its high recognition accuracy and speed.
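
As a rough illustration of one of the descriptors named above, the following sketch computes a gradient direction histogram from local depth differences on a small depth patch. The central-difference gradient, the eight direction bins, the magnitude weighting, and the name `depth_gradient_histogram` are assumptions made for the example; the descriptor actually used in this thesis may be defined differently.

```python
import math


def depth_gradient_histogram(patch, bins=8):
    """Magnitude-weighted histogram of depth-gradient directions over a patch.

    patch: rows of depth values (at least 3x3). Gradients are central
    differences of neighbouring depths, so border pixels are skipped.
    Returns a list of `bins` values that sum to 1 (or all zeros for a flat patch).
    """
    hist = [0.0] * bins
    height, width = len(patch), len(patch[0])
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = patch[y][x + 1] - patch[y][x - 1]   # depth difference along x
            gy = patch[y + 1][x] - patch[y - 1][x]   # depth difference along y
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue                              # flat neighbourhood: no direction
            angle = math.atan2(gy, gx) % (2 * math.pi)
            hist[int(angle / (2 * math.pi) * bins) % bins] += magnitude
    total = sum(hist)
    return [v / total for v in hist] if total else hist


# A surface whose depth grows steadily along x: every gradient points along +x.
ramp_x = [[10 * x for x in range(4)] for _ in range(4)]
hist_x = depth_gradient_histogram(ramp_x)

# A completely flat surface has no gradient direction at all.
flat = [[500] * 4 for _ in range(4)]
hist_flat = depth_gradient_histogram(flat)
```

For the ramp, all of the weight falls into the first direction bin, while the flat patch yields an all-zero histogram. Histograms like this, computed per local region and concatenated, would form one part of a feature vector passed to a classifier.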