DVONet: Unsupervised Monocular Depth Estimation and Visual Odometry



Publication Details

Li, X., Hou, Y., Wu, Q., Wang, P. & Li, W. (2019). DVONet: Unsupervised Monocular Depth Estimation and Visual Odometry. 2019 IEEE International Conference on Visual Communications and Image Processing, VCIP 2019.


© 2019 IEEE. This paper proposes an unsupervised learning framework for monocular depth estimation and visual odometry (VO), referred to as DVONet. The framework is trained on stereo image sequences and estimates absolute-scale scene depth and camera poses from monocular images. To mitigate the effect of stereo occlusions during training and improve depth estimation, a left-right occlusion mask is introduced. In addition, a novel VO network is proposed in which the feature extraction network is shared between pose estimation and optical flow estimation. The proposed DVONet achieves state-of-the-art results on both depth estimation and VO tasks on the KITTI driving dataset, outperforming existing unsupervised methods and performing comparably to traditional ones.
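The abstract does not spell out how the left-right occlusion mask is built; a common formulation in stereo-supervised depth training is a left-right disparity consistency check, sketched below. All names, the warping scheme, and the threshold are assumptions for illustration, not the paper's exact method.

```python
import numpy as np

def left_right_occlusion_mask(disp_left, disp_right, threshold=1.0):
    """Per-pixel visibility mask from a left-right disparity consistency check.

    A left-image pixel (x, y) with disparity d should correspond to right-image
    pixel (x - d, y) carrying (approximately) the same disparity; a large
    disagreement flags a pixel occluded in one view, which would be excluded
    from the photometric reconstruction loss during training.
    """
    h, w = disp_left.shape
    ys = np.arange(h)[:, None].repeat(w, axis=1)
    xs = np.arange(w)[None, :].repeat(h, axis=0).astype(np.float32)
    # Column each left pixel maps to in the right image (nearest-neighbour).
    x_right = np.clip(np.round(xs - disp_left).astype(int), 0, w - 1)
    # Right disparity sampled at the corresponding locations.
    disp_right_warped = disp_right[ys, x_right]
    # True where the two disparities agree, i.e. the pixel is visible in both views.
    return np.abs(disp_left - disp_right_warped) < threshold
```

Pixels where the mask is False would simply be dropped from the loss, so occluded regions do not corrupt the depth gradients.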



