Master of Engineering by Research
School of Electrical, Computer and Telecommunications Engineering
Sun, Huiguang, Mobile visual search, Master of Engineering by Research thesis, School of Electrical, Computer and Telecommunications Engineering, University of Wollongong, 2013. http://ro.uow.edu.au/theses/4108
With high-resolution cameras, large storage, powerful CPUs, and fast wireless network connections, mobile phones have evolved into capable image processing and transmission devices. Increasing amounts of visual data are exchanged between users and service providers on mobile platforms, and mobile applications for processing this data have advanced significantly in recent years. These applications are supported by Content-Based Image Retrieval (CBIR) technology. Mobile Visual Search (MVS), a new research area within CBIR, provides search and retrieval of visual information specifically for mobile devices.
This project investigates a mobile visual search system based on image synthesis and sparse coding. Local features are extracted from the image and fed to a feature aggregation algorithm to form a feature vector describing the content of the image. To improve the affine invariance of the content description, SIFT features are extracted from both the original image and synthesized images. A sparse coding algorithm is then employed to aggregate the SIFT features into a compact visual descriptor, and the dimensionality of this descriptor is further reduced using Principal Component Analysis (PCA). Image synthesis parameters, pooling schemes and the size of the compressed feature vector are explored to find the configuration that gives the best visual search performance.
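The aggregation steps described above (sparse coding, pooling, PCA) can be illustrated with a minimal sketch. It is not the implementation used in the thesis. It assumes NumPy, a random unit-norm dictionary in place of a learned one, random vectors in place of real 128-dimensional SIFT descriptors, and a plain ISTA solver for the sparse codes. The function names and the values of `lam`, `n_iter`, the dictionary size and the output dimension are all hypothetical.

```python
import numpy as np

def sparse_codes(X, D, lam=0.1, n_iter=100):
    """ISTA for min_A 0.5*||X - A D||_F^2 + lam*||A||_1.
    X: (n, d) local descriptors, D: (k, d) dictionary atoms."""
    L = np.linalg.norm(D @ D.T, 2)            # Lipschitz constant of the gradient
    A = np.zeros((X.shape[0], D.shape[0]))
    for _ in range(n_iter):
        Z = A - ((A @ D - X) @ D.T) / L       # gradient step
        A = np.sign(Z) * np.maximum(np.abs(Z) - lam / L, 0.0)  # soft threshold
    return A

def pool(A, scheme="max"):
    """Aggregate per-feature codes (n, k) into one image-level vector (k,)."""
    mags = np.abs(A)
    return mags.max(axis=0) if scheme == "max" else mags.mean(axis=0)

def pca_fit(V, m):
    """Return the mean and the top-m principal directions of the rows of V."""
    mean = V.mean(axis=0)
    _, _, Vt = np.linalg.svd(V - mean, full_matrices=False)
    return mean, Vt[:m]

def pca_project(V, mean, components):
    return (V - mean) @ components.T

# Toy data: 10 "images", each with 50 stand-in 128-D descriptors (assumption).
rng = np.random.default_rng(0)
D = rng.standard_normal((32, 128))
D /= np.linalg.norm(D, axis=1, keepdims=True)   # unit-norm atoms
images = [rng.standard_normal((50, 128)) for _ in range(10)]

pooled = np.stack([pool(sparse_codes(X, D)) for X in images])  # shape (10, 32)
mean, comps = pca_fit(pooled, 8)
descriptors = pca_project(pooled, mean, comps)                 # shape (10, 8)
```

Passing `scheme="avg"` to `pool` switches from max pooling to average pooling, which is one way to compare pooling schemes. The last argument of `pca_fit` sets the size of the compressed feature vector.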
To reduce the computation cost and improve efficiency, a feature selection method using Graph-Based Visual Saliency (GBVS) detection is proposed. Only the salient features located inside the detected saliency map are used to represent the dominant objects in the image. The K-Nearest Neighbor (KNN) classifier is used to match the visual descriptor of the query image with those of the database images. The proposed visual descriptor is tested on two datasets: it achieves a Top-4-Score of 3.47 on the UK benchmark database and a mean Average Precision of 59.2% on the Holidays database. Furthermore, the proposed visual descriptor outperforms three state-of-the-art visual descriptors, namely Bag of Features (BoF), Fisher Vector (FV) and Vector of Locally Aggregated Descriptors (VLAD). The proposed mobile visual search system employs both the proposed visual descriptor and the GBVS feature selection scheme. Evaluated on the same two datasets, the system reduces the number of features by 25% at the cost of less than 1% reduction in retrieval accuracy.
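The selection, matching and evaluation steps above can be sketched as follows. This is a hedged illustration rather than the thesis code. It assumes a precomputed saliency map given as a 2-D list (GBVS itself is not implemented here), Euclidean distance for nearest-neighbor ranking, and the usual reading of the Top-4-Score as the number of relevant images among the first four results. All function names and the `threshold` value are hypothetical.

```python
import math

def select_salient(keypoints, saliency, threshold=0.5):
    """Keep (x, y) keypoints whose saliency value reaches the threshold."""
    return [(x, y) for (x, y) in keypoints
            if saliency[int(y)][int(x)] >= threshold]

def rank_database(query, database):
    """Return database indices ordered by Euclidean distance to the query."""
    dists = [math.dist(query, d) for d in database]
    return sorted(range(len(database)), key=lambda i: dists[i])

def top4_count(ranking, labels, query_label):
    """Number of the first four results that share the query's label.
    Averaging this over all queries gives a Top-4-Score."""
    return sum(1 for i in ranking[:4] if labels[i] == query_label)

def average_precision(ranking, relevant):
    """Average of precision values at the rank of each relevant item.
    Averaging this over all queries gives mean Average Precision."""
    hits, total = 0, 0.0
    for rank, idx in enumerate(ranking, start=1):
        if idx in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0

# Toy example: four database descriptors forming two groups.
database = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
labels = [0, 0, 1, 1]
ranking = rank_database((0.0, 0.2), database)
ap = average_precision(ranking, {0, 1})
```

In this toy example the two descriptors of group 0 are the closest to the query, so they are ranked first and the average precision is 1.0.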