Degree Name

Master of Philosophy


School of Electrical, Computer and Telecommunications Engineering


Underwater mines are a cost-effective weapon in asymmetric warfare, and are commonly used to block shipping lanes and restrict naval operations. Consequently, they threaten commercial and military vessels, disrupt humanitarian aid, and damage marine environments. There is strong international interest in using sonars and AI for mine countermeasures and undersea surveillance. High-resolution imaging sonars are well suited to detecting underwater mines and other targets. Compared to other sensors, sonars are more effective in undersea environments with low visibility.

This project aims to investigate deep learning algorithms for two important tasks in undersea surveillance: naval mine detection and seabed terrain segmentation. Our goal is to automatically classify the composition of the seabed and localise naval mines.

This research uses real sonar data provided by the Defence Science and Technology Group (DSTG). To conduct the experiments, we annotated 150 sonar images for semantic segmentation; the annotation was guided by experts from the DSTG. We also used 152 sonar images with mine detection annotations prepared by members of the Centre for Signal and Information Processing at the University of Wollongong.

Our results show that Faster R-CNN achieves the highest performance in object detection. We evaluated transfer learning and data augmentation for object detection: they improved our detection model's mAP by 11.9% and 16.9%, and its mAR by 17.8% and 21.1%, respectively. Furthermore, we developed a data augmentation algorithm called Evolutionary Cut-Paste, which yielded a 20.2% increase in performance. For segmentation, we found that highly tuned DeepLabV3 and U-Net++ models perform best. We evaluated various configurations of optimisers, learning rate schedules and encoder networks for each model architecture. Additionally, model hyper-parameters were tuned prior to training using various tests. Finally, we applied Median Frequency Balancing to mitigate model bias towards frequently occurring classes. We favour DeepLabV3 for its reliable detection of underrepresented classes over U-Net++, whose strength is accurate boundaries. All of the models satisfied the constraint of real-time operation when running on an NVIDIA GTX 1070.
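Median Frequency Balancing, mentioned above, is commonly formulated as weighting each class by the median class frequency divided by that class's own frequency, where a class's frequency is its pixel count divided by the total pixel count of the images in which it appears. The following is a minimal sketch of that common formulation, not the implementation used in this thesis; the function name, the list-of-rows mask format, and the zero weight for absent classes are assumptions made for illustration.

```python
from collections import Counter
from statistics import median


def median_frequency_weights(masks, num_classes):
    """Return one loss weight per class via median frequency balancing.

    masks: iterable of 2-D label masks (lists of rows of integer class ids).
    freq(c)   = pixels of class c / total pixels of the masks containing c
    weight(c) = median of all freq values / freq(c)
    """
    class_pixels = Counter()  # pixels labelled c, summed over all masks
    image_pixels = Counter()  # total pixels of the masks that contain c

    for mask in masks:
        counts = Counter(label for row in mask for label in row)
        total = sum(counts.values())
        for label, n in counts.items():
            class_pixels[label] += n
            image_pixels[label] += total

    freqs = {c: class_pixels[c] / image_pixels[c]
             for c in range(num_classes) if class_pixels[c] > 0}
    med = median(freqs.values())
    # Assumption: classes never seen in the data get weight 0.0.
    return [med / freqs[c] if c in freqs else 0.0 for c in range(num_classes)]


# Hypothetical toy masks: class 0 dominates, classes 1 and 2 are rare.
toy_masks = [
    [[0, 0, 0, 1], [0, 0, 0, 1]],
    [[0, 0, 2, 2], [0, 0, 0, 0]],
]
weights = median_frequency_weights(toy_masks, num_classes=3)
```

The resulting weights would typically be passed to a class-weighted segmentation loss, so that errors on rare seabed classes count for more than errors on the dominant class.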



Unless otherwise indicated, the views expressed in this thesis are those of the author and do not necessarily represent the views of the University of Wollongong.