Unsupervised image classification by probabilistic latent semantic analysis for the annotation of images
Image annotation has been identified as a suitable means of closing the semantic gap that limits the accuracy of Content-Based Image Retrieval. However, existing methods of automatic image annotation depend on supervised learning, which can be difficult to implement because it requires manually annotated training samples that are not always readily available. This paper argues that unsupervised learning via Probabilistic Latent Semantic Analysis provides a more suitable machine learning approach for image annotation, especially because of its potential to base categorisation on the latent semantic content of the image samples, which can bridge the semantic gap present in Content-Based Image Retrieval. This paper therefore proposes an unsupervised image categorisation model in which the semantic content of images is discovered using Probabilistic Latent Semantic Analysis, after which the images are clustered into distinct groups based on semantic content similarity using the K-means algorithm, thereby providing suitable annotation exemplars. A common problem with categorisation algorithms based on Bag-of-Visual-Words modelling is the loss of accuracy caused by the spatial incoherency of that representation. This paper therefore also examines the effectiveness of the spatial pyramid as a means of eliminating spatial incoherency in Probabilistic Latent Semantic Analysis classification.
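The two-stage pipeline described above (latent topics from Probabilistic Latent Semantic Analysis, then K-means on the per-image topic mixtures) can be illustrated with a minimal sketch. It is not the paper's implementation: the function names `plsa` and `kmeans`, the toy count matrix, the number of topics, and the iteration counts are all assumptions chosen for illustration, and the spatial pyramid step is omitted. The input is assumed to be a Bag-of-Visual-Words count matrix with one row per image.

```python
import numpy as np

def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on an (images x visual words) count matrix.

    Returns P(z|d) with shape (D, Z) and P(w|z) with shape (Z, W).
    """
    rng = np.random.default_rng(seed)
    n_docs, n_words = counts.shape
    # Random, row-normalised initial distributions.
    p_z_d = rng.random((n_docs, n_topics))
    p_z_d /= p_z_d.sum(axis=1, keepdims=True)
    p_w_z = rng.random((n_topics, n_words))
    p_w_z /= p_w_z.sum(axis=1, keepdims=True)
    eps = 1e-12  # guards against division by zero
    for _ in range(n_iter):
        # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
        post = p_z_d[:, :, None] * p_w_z[None, :, :]      # (D, Z, W)
        post /= post.sum(axis=1, keepdims=True) + eps
        # M-step: re-estimate both distributions from expected counts.
        expected = counts[:, None, :] * post              # (D, Z, W)
        p_w_z = expected.sum(axis=0)
        p_w_z /= p_w_z.sum(axis=1, keepdims=True) + eps
        p_z_d = expected.sum(axis=2)
        p_z_d /= p_z_d.sum(axis=1, keepdims=True) + eps
    return p_z_d, p_w_z

def kmeans(points, k, n_iter=20, seed=0):
    """Plain Lloyd's K-means; returns one cluster label per row."""
    rng = np.random.default_rng(seed)
    # Initialise centres from k distinct data points (fancy indexing copies).
    centers = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iter):
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep the old centre if a cluster empties
                centers[j] = points[labels == j].mean(axis=0)
    return labels

# Toy Bag-of-Visual-Words counts (hypothetical): 6 images, 6 visual words.
counts = np.array([
    [5, 4, 3, 0, 0, 0],
    [4, 5, 2, 0, 1, 0],
    [3, 4, 5, 0, 0, 0],
    [0, 0, 0, 5, 4, 3],
    [0, 1, 0, 4, 5, 2],
    [0, 0, 0, 3, 4, 5],
])
topic_mix, topic_words = plsa(counts, n_topics=2)
labels = kmeans(topic_mix, k=2)
```

Clustering on the topic mixtures P(z|d) rather than on the raw visual-word counts is what lets the grouping reflect latent content instead of surface word frequencies. In practice a library K-means with several restarts would likely replace the hand-written one here.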