Year

2023

Degree Name

Master of Philosophy (Computer Science)

Department

School of Computing and Information Technology

Abstract

Digitised archival photo collections allow members of the public to view images relating to history and democracy. Recent advances in visual tasks such as Content-Based Image Retrieval, together with the development of deep neural networks, have provided modern methods for analysing digitised images and performing image queries for retrieval. We explore the image retrieval task using several publicly available datasets and a set of archival images from the National Archives of Australia, and propose a simple change to an existing pooling method to improve retrieval performance on the archival set.

A second visual task, object localisation, concerns training a model to locate the positions of objects in an image, given English text phrases. Following recent advances in large-scale text embedding models, pre-trained text models retain rich semantic structure. While other object localisation methods train text pathways within their deep neural models, we explore the direct use of a large-scale text embedding for this task, and demonstrate that it can localise objects, even for unseen words.

FoR codes (2008)

080104 Computer Vision, 170203 Knowledge Representation and Machine Learning, 080704 Information Retrieval and Web Search, 080108 Neural, Evolutionary and Fuzzy Computation

Unless otherwise indicated, the views expressed in this thesis are those of the author and do not necessarily represent the views of the University of Wollongong.