Master of Philosophy
School of Computing and Information Technology
Advances in the internet and digital image sensors have led to a drastic increase in the volume of visual content. This content is generated and shared from various sources, such as medical archival photos, social media platforms, the education sector, and robotics. The growth has raised new challenges in finding content that matches users’ interests, the problem addressed by content-based image retrieval (CBIR), a long-established research area.
As an initial step toward the final aim of this thesis, content-based image retrieval using a deep convolutional network was investigated to explore existing pooling mechanisms, diffusion mechanisms for image retrieval, and the challenges encountered in the task. Experimental results are reported on various image retrieval benchmark datasets and a community archival photo dataset. As part of this initial study, results on visual grounding, another computer vision task that has gained momentum, are also reported on the community archival photo dataset.
Chakraborty, Bela, Inter-modality Fusion based Attention for Zero-shot Cross-modal Retrieval, Master of Philosophy thesis, School of Computing and Information Technology, University of Wollongong, 2021. https://ro.uow.edu.au/theses1/1168
FoR codes (2008)
080104 Computer Vision, 080106 Image Processing, 080107 Natural Language Processing, 080199 Artificial Intelligence and Image Processing not elsewhere classified
Unless otherwise indicated, the views expressed in this thesis are those of the author and do not necessarily represent the views of the University of Wollongong.