Zhimin Gao



Degree Name

Doctor of Philosophy


School of Computing and Information Technology


Object-based image classification and retrieval are two fundamental problems in visual recognition and understanding, and solutions to them support a wide range of applications. Both tasks aim to identify or match a specific kind of object in an image against a set of database images. The objects in these images generally vary in appearance, translation, scale, pose and viewpoint, which makes both tasks challenging. Feature representations that can effectively handle such variations are crucial for both image classification and retrieval, and they have been a driving force of computer vision research for the last two decades. In recent years, deep learning models, especially deep convolutional neural networks (CNNs), have delivered substantial performance gains on a variety of visual recognition tasks owing to their strong feature representation capability. Compared with conventional shallow hand-crafted features, the feature representations learned by deep CNNs can encode high-level abstractions of objects. How to deploy deep CNNs effectively for specific recognition tasks remains an active research question in computer vision and machine learning. This thesis develops advanced methods centered on deep feature representations to improve the accuracy of object-based image classification and retrieval from the following three aspects.