Year

2005

Degree Name

Doctor of Philosophy (PhD)

Department

School of Electrical, Computer and Telecommunications Engineering - Faculty of Informatics

Abstract

Due to the popularity of multimedia applications, much effort has been directed towards new services and functionalities such as interactivity, manipulation, content-based retrieval and scalability. Object-based image/video representation and processing is one of the approaches considered to meet these desired functionalities. However, semantic image and video segmentation is one of the unresolved challenges of this approach. Although many works on segmentation have contributed towards this goal, numerous areas still require further research. In this work, a comprehensive range of image and video segmentation algorithms, covering both low and high level phases, are proposed, tested and analysed. In the low level phase, the image/frame is partitioned into homogeneous regions, while in the high level phase the 'objects-of-interest' are extracted. The proposed algorithms are useful for generic segmentation applications, in particular for scalable coding, which distributes information over heterogeneous networks. One requirement of scalable coding is that the shapes of an object produced at different resolutions should be similar; more precisely, the low resolution objects should be down-sampled versions of the higher resolution objects. Multidimensional processing integrated with multiresolution segmentation reduces computational complexity and provides scalability for the objects/regions extracted at different resolutions, which is necessary for scalable coding algorithms. Including smoothness as a visual quality criterion in the segmentation and classification algorithms improves the visual quality of the segmentation results. To meet the scalability and smoothness constraints, a Markov Random Field (MRF) framework flexible enough to accommodate them is utilised.
The proposed algorithm is a reliable and effective low level segmentation method which includes the desirable features of both single and multiresolution segmentation algorithms. Objective and subjective tests, covering the number of regions, discrimination between meaningful regions, smoothness, and visual attractiveness as measured/estimated through the smoothness function, confirm the superiority of the proposed scalable algorithm over regular single and multiresolution segmentation algorithms. The novel objective function gives the proposed algorithm the flexibility to segment YUV colour images where Y is in full resolution but U and V are in half resolution. In the high level phase of the image segmentation process, a hierarchical searching method for extracting the 'object-of-interest' is introduced. The search is based on the concept of the global precedence effect (GPE) of the human visual system (HVS), which searches for large (global) objects before small (local) ones. The proposed algorithm compares different combinations of regions with the 'object-of-interest' template to find the best combination. An irregular pyramid is developed which retains the global objects at the lower levels. A hierarchical search for the 'object-of-interest' template starts from the lowest level of this pyramid. This natural priority in searching is very useful when the 'object-of-interest' is the main object in the image. The computational complexity of the search is reduced significantly. In video segmentation, the 'object-of-interest' in the first frame is determined either by user intervention or by the proposed 'object-of-interest' extraction algorithm. In the subsequent frames, regions generated by the spatial segmentation are grouped into foreground and background areas by an MRF-based classification algorithm. The objective function of the classification algorithm includes spatial and temporal continuity, motion constraints and smoothness terms.
The proposed algorithm tracks the objects detected in the previous frames and extracts newly appearing objects in the current frame. The algorithm is developed in a scalable multiresolution mode, in which the corresponding regions at the lower and higher resolutions are processed and analysed together. The proposed algorithm extracts moving objects at different resolutions with scalability and visual quality (smoothness) as constraints. It allows the detection of larger motion, better noise tolerance and lower computational complexity.
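
The scalability requirement stated in the abstract, that a low resolution object should be the down-sampled version of the higher resolution object, can be illustrated with a small check on binary object masks. This is only a sketch under stated assumptions: the function names `downsample` and `is_scalable` are hypothetical, and plain decimation by a factor of 2 is assumed as the down-sampling operator, which may differ from the operator used in the thesis.

```python
from typing import List

Mask = List[List[int]]  # 1 = object pixel, 0 = background


def downsample(mask: Mask, factor: int = 2) -> Mask:
    """Keep every `factor`-th pixel in each direction (plain decimation, an assumption)."""
    return [row[::factor] for row in mask[::factor]]


def is_scalable(low: Mask, high: Mask, factor: int = 2) -> bool:
    """True when the low resolution mask equals the decimated high resolution mask."""
    return low == downsample(high, factor)


# Hypothetical 4x4 object mask at the higher resolution.
high_res = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]

consistent = downsample(high_res)   # satisfies the constraint by construction
inconsistent = [[1, 0], [1, 1]]     # differs from the decimated mask in one pixel
```

A segmentation pair for which `is_scalable` returns `False` would violate the shape-consistency requirement of scalable coding as described above; the check says nothing about how the masks were obtained.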

Unless otherwise indicated, the views expressed in this thesis are those of the author and do not necessarily represent the views of the University of Wollongong.