Doctor of Philosophy
School of Computing and Information Technology
Medical image synthesis and segmentation are two essential per-voxel prediction tasks that apply to a wide range of clinical situations and can help clinical experts solve practical problems. Both tasks estimate a target value for each voxel of the input medical image. For synthesis, the target value is the intensity in the target image; for segmentation, it is the class label indicating whether the voxel belongs to the region of interest. Compared with many other computer vision tasks, such as image classification, which estimates an image-level label, high-quality medical image synthesis and segmentation must capture more visual detail from both local and global context in order to predict dense voxel-wise information. In recent years, various deep neural network models have been developed for data processing problems because of their powerful capacity to extract task-specific features from input data. Among these models, deep convolutional neural networks (CNNs) have shown promising performance in computer vision. For generic per-pixel image prediction tasks, carefully designed CNNs can capture the crucial underlying features that represent both local and global knowledge in the given images and efficiently estimate the corresponding target outputs. Motivated by the strong learning capability of CNNs, this thesis aims to explore more effective and efficient deep CNN-based methods for per-voxel prediction on medical images. Because medical images have their own characteristics, such as higher image dimensionality and less accessible labeled data than generic images, it is challenging to design CNN-based models that fully and explicitly exploit these characteristics and address the learning issues they cause.
Therefore, this thesis also explores more advanced learning techniques and integrates them into the training of deep CNNs to further improve per-voxel prediction performance on medical images. Specifically, this thesis investigates medical image synthesis and segmentation from the following four aspects.
Firstly, this thesis develops adversarial-learning-based deep CNN models for cross-modality magnetic resonance (MR) image synthesis. Although deep CNNs are strong at image feature extraction, CNN-based generative adversarial networks (GANs) have recently shown more promising performance for generic image synthesis. Through the adversarial competition between the generator and the discriminator, GANs can synthesize more realistic images than conventional CNNs. However, applying these GANs directly to medical image synthesis does not give satisfactory results. One reason is that many medical images, such as MR images, are three-dimensional. The 2D GANs commonly used on generic images easily fail to capture the continuous visual cues across the 2D slices of an input MR image. To deal with this problem, a 3D CNN-based GAN model is developed in this thesis to learn the synthesis mapping from one MR modality to another. The designed GAN model also uses a U-Net-like generator so that both the local object content and the whole-image context of the given images can be captured over a larger 3D scope for better synthesis. The proposed 3D GAN model is demonstrated to be superior to general 2D GANs on a public MR image dataset.
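The limitation of slice-wise 2D processing mentioned above can be illustrated with a toy sketch. This is not the thesis model: it replaces learned convolutions with a fixed averaging filter, and the names `box_mean` and `make_volume` are hypothetical helpers introduced only for this illustration. It shows that a filter restricted to one slice cannot react to a change in a neighbouring slice, whereas a 3D filter can.

```python
# Toy illustration (assumption: a fixed box filter stands in for a learned
# convolution). Compares a 2D in-slice neighbourhood with a 3D neighbourhood.

def box_mean(volume, z, y, x, use_depth):
    """Mean over the 3x3 (in-slice) or 3x3x3 neighbourhood of voxel (z, y, x)."""
    depth, height, width = len(volume), len(volume[0]), len(volume[0][0])
    # A 2D filter only sees slice z; a 3D filter also sees slices z-1 and z+1.
    z_range = range(z - 1, z + 2) if use_depth else [z]
    values = [
        volume[k][j][i]
        for k in z_range
        for j in range(y - 1, y + 2)
        for i in range(x - 1, x + 2)
        if 0 <= k < depth and 0 <= j < height and 0 <= i < width
    ]
    return sum(values) / len(values)


def make_volume(depth=3, height=3, width=3, fill=0.0):
    """Build a depth x height x width volume as nested lists."""
    return [[[fill for _ in range(width)] for _ in range(height)]
            for _ in range(depth)]


volume = make_volume()
changed = make_volume()
changed[2][1][1] = 9.0  # alter one voxel in the slice adjacent to slice 1

centre = (1, 1, 1)
# Change in the filter response at the centre voxel caused by the alteration.
diff_2d = (box_mean(changed, *centre, use_depth=False)
           - box_mean(volume, *centre, use_depth=False))
diff_3d = (box_mean(changed, *centre, use_depth=True)
           - box_mean(volume, *centre, use_depth=True))

print("2D response change:", diff_2d)
print("3D response change:", diff_3d)
```

In this sketch the slice-wise response is unchanged while the 3D response shifts, which is the kind of cross-slice information a 3D CNN-based generator has access to and a 2D one does not.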
Yu, Biting, Per-voxel Prediction on Medical Images via Deep Neural Networks, Doctor of Philosophy thesis, School of Computing and Information Technology, University of Wollongong, 2020. https://ro.uow.edu.au/theses1/949
Unless otherwise indicated, the views expressed in this thesis are those of the author and do not necessarily represent the views of the University of Wollongong.