Master of Engineering by Research
School of Electrical, Computer and Telecommunications Engineering - Faculty of Informatics
Shao, Wenbin, Automatic annotation of digital photos, Master of Engineering by Research thesis, School of Electrical, Computer and Telecommunications Engineering - Faculty of Informatics, University of Wollongong, 2007. http://ro.uow.edu.au/theses/701
Content-based image retrieval searches for an image by using a set of visual features that characterize the image content. This technique has been used in many areas, such as geographical information processing, space science, biomedical image processing, target recognition in military applications and bioinformatics. Many approaches have been proposed to reduce the gap between the low-level visual features and high-level contents. In this thesis, a multi-class automatic annotation system is developed to bridge the semantic gap. Given an image, the proposed system will automatically generate keywords corresponding to the image contents. The system is evaluated using a large image database consisting of over 16000 images collected from various online repositories.
The proposed multi-class annotation system is based on salient features and support vector machines (SVMs). A new feature called gradient direction histogram is proposed for image classification. Instead of relying on a single feature, the SVMs in our system can automatically select the most suitable features from a pool of six MPEG-7 visual descriptors and the proposed gradient direction histogram. Multi-class SVMs are constructed using two-class SVMs in different combinations.
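To make the idea of a gradient direction histogram concrete, the following is a minimal sketch only. The abstract does not define the feature, so the central-difference gradient, the magnitude weighting, the number of bins and the function name `gradient_direction_histogram` are all assumptions made for illustration and may differ from the thesis.

```python
import math


def gradient_direction_histogram(image, n_bins=8):
    """Normalised histogram of gradient directions for a grayscale image.

    `image` is a list of rows of numeric pixel values. The bin count and the
    magnitude weighting are illustrative assumptions, not the thesis's method.
    """
    height, width = len(image), len(image[0])
    hist = [0.0] * n_bins
    # Border pixels are skipped so that central differences are defined.
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = image[y][x + 1] - image[y][x - 1]
            gy = image[y + 1][x] - image[y - 1][x]
            magnitude = math.hypot(gx, gy)
            if magnitude == 0:
                continue  # flat region: no direction to record
            # Map the direction into [0, 2*pi) and then to a bin index.
            angle = math.atan2(gy, gx) % (2 * math.pi)
            bin_index = min(int(angle / (2 * math.pi) * n_bins), n_bins - 1)
            hist[bin_index] += magnitude
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


# A horizontal intensity ramp: every gradient points along the +x direction.
ramp = [[x for x in range(5)] for _ in range(5)]
ramp_histogram = gradient_direction_histogram(ramp)
```

A vector of this kind could be concatenated with other descriptors before being passed to a classifier; how the thesis combines or selects features is not stated in the abstract.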
We have examined several multi-class support vector machines, including one-versus-all SVMs, pair-wise SVMs and decision directed acyclic graph SVMs. The results confirm that the pair-wise and decision directed acyclic graph SVMs are suitable for multi-class applications. For pair-wise SVMs, we propose a voting scheme named confidence score voting. Our results show that, compared to majority voting, confidence score voting improves the classification accuracy. Combining salient features leads to a significant improvement in the classification rate.
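The difference between the combination schemes can be illustrated with a small sketch. It assumes each pair-wise classifier for classes (i, j), with i < j, returns a signed decision value that is positive when class i is preferred. Treating the absolute decision value as the "confidence score" is an assumption made here for illustration; the abstract does not give the thesis's definition. The function names and the example scores are hypothetical.

```python
def majority_vote(pairwise_scores, n_classes):
    """Each pair-wise classifier casts one vote for its preferred class."""
    votes = [0] * n_classes
    for (i, j), score in pairwise_scores.items():
        votes[i if score >= 0 else j] += 1
    # Ties are broken in favour of the lowest class index.
    return max(range(n_classes), key=lambda c: votes[c])


def confidence_vote(pairwise_scores, n_classes):
    """Each classifier adds its decision magnitude to its preferred class.

    Assumption: the confidence score is the absolute decision value.
    """
    totals = [0.0] * n_classes
    for (i, j), score in pairwise_scores.items():
        totals[i if score >= 0 else j] += abs(score)
    return max(range(n_classes), key=lambda c: totals[c])


def ddag_decide(pairwise_scores, n_classes):
    """Decision DAG: compare first and last candidates, drop the loser."""
    candidates = list(range(n_classes))
    while len(candidates) > 1:
        i, j = candidates[0], candidates[-1]
        if pairwise_scores[(i, j)] >= 0:
            candidates.pop()      # class j loses
        else:
            candidates.pop(0)     # class i loses
    return candidates[0]


# Hypothetical decision values for a three-class problem.
example_scores = {(0, 1): 0.1, (0, 2): -0.1, (1, 2): 2.0}
majority_result = majority_vote(example_scores, 3)      # three-way tie
confidence_result = confidence_vote(example_scores, 3)  # class 1 dominates
ddag_result = ddag_decide(example_scores, 3)
```

In this made-up case every class wins exactly one pair-wise contest, so majority voting falls back on an arbitrary tie-break, while the confidence-weighted sum favours the class backed by the one decisive classifier.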
The proposed system is compared to k-nearest neighbours and neural networks using the same dataset. The results show that the proposed system outperforms these two classifiers in the four-class classification problem. The research project also investigates the system performance when the input image is cropped, resized or rotated.