DVONet: Unsupervised Monocular Depth Estimation and Visual Odometry
© 2019 IEEE. This paper proposes DVONet, an unsupervised learning framework for monocular depth estimation and visual odometry (VO). The framework is trained on stereo image sequences and estimates absolute-scale scene depth and camera poses from monocular images. To mitigate the effect of stereo occlusions during training and to improve depth estimation, a left-right occlusion mask is introduced. In addition, a novel VO network is proposed in which the feature extraction network is shared between pose estimation and optical flow estimation. DVONet achieves state-of-the-art results for both depth estimation and VO on the KITTI driving dataset, outperforming existing unsupervised methods and performing comparably to traditional ones.
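
The abstract does not say how the left-right occlusion mask is computed. As a rough illustration only, the sketch below uses a generic left-right disparity consistency check, a common way to flag pixels visible in one stereo view but not the other; it is an assumption, not necessarily DVONet's formulation. The function names, the nearest-neighbour sampling, and the `threshold` parameter are all hypothetical. The second helper shows the standard stereo relation depth = focal length × baseline / disparity, which is why training with a known stereo baseline can yield metric (absolute-scale) depth.

```python
import numpy as np

def left_right_occlusion_mask(disp_left, disp_right, threshold=1.0):
    """Hypothetical left-right consistency mask (True = visible in both views).

    Assumes rectified stereo with positive disparities in pixels, so a left
    pixel at column x corresponds to the right pixel at column x - disp_left.
    """
    h, w = disp_left.shape
    xs = np.tile(np.arange(w), (h, 1))

    # Column in the right image that each left pixel maps to (nearest pixel).
    x_right = np.rint(xs - disp_left).astype(int)
    in_bounds = (x_right >= 0) & (x_right < w)

    # Sample the right disparity map at the matched location.
    rows = np.arange(h)[:, None]
    disp_right_warped = disp_right[rows, np.clip(x_right, 0, w - 1)]

    # Pixels whose two disparity estimates disagree are treated as occluded.
    consistent = np.abs(disp_left - disp_right_warped) <= threshold
    return in_bounds & consistent

def disparity_to_depth(disp, focal_px, baseline_m, eps=1e-6):
    """Standard stereo relation: depth = focal length * baseline / disparity."""
    return focal_px * baseline_m / np.maximum(disp, eps)

if __name__ == "__main__":
    disp_l = np.full((2, 6), 2.0)
    disp_r = np.full((2, 6), 2.0)
    disp_r[0, 1] = 5.0  # inject an inconsistency seen by left pixel (0, 3)
    print(left_right_occlusion_mask(disp_l, disp_r))
    print(disparity_to_depth(disp_l, focal_px=100.0, baseline_m=0.5)[0, 0])
```

In a training loop, a mask of this kind would typically multiply the per-pixel photometric reconstruction loss so that occluded pixels do not contribute to the gradient; whether DVONet applies its mask this way is not stated in the abstract.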