University of Wollongong

Modeling Long-range Dependencies and Epipolar Geometry for Multi-view Stereo

journal contribution
posted on 2024-11-17, 15:34 authored by Jie Zhu, Bo Peng, Wanqing Li, Haifeng Shen, Qingming Huang, Jianjun Lei
This article proposes a network, referred to as Multi-View Stereo TRansformer (MVSTR), for depth estimation from multi-view images. By modeling long-range dependencies and epipolar geometry, MVSTR extracts dense features with global context and 3D consistency, both of which are crucial for reliable matching in multi-view stereo (MVS). Specifically, to address the limited receptive field of existing CNN-based MVS methods, a global-context Transformer module is designed to establish intra-view long-range dependencies, so that global contextual features are obtained for each view. In addition, to make the features of each view 3D-consistent, a 3D-consistency Transformer module with an epipolar feature sampler is built, in which epipolar geometry is modeled to facilitate cross-view interaction. Experimental results show that the proposed MVSTR achieves the best overall performance on the DTU dataset and demonstrates strong generalization on the Tanks & Temples benchmark dataset.
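The epipolar constraint mentioned in the abstract can be illustrated with a minimal sketch. It is not the MVSTR epipolar feature sampler, whose details are not given on this page. It only shows, for two pinhole cameras with assumed known intrinsics and relative pose, how a pixel in a reference view maps to a line in a source view along which candidate matches could be sampled. All function names (`skew`, `fundamental_matrix`, `sample_epipolar_points`) and parameters are hypothetical and chosen for illustration.

```python
import numpy as np


def skew(t):
    """Return the 3x3 matrix [t]_x such that skew(t) @ v == np.cross(t, v)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz, ty],
                     [tz, 0.0, -tx],
                     [-ty, tx, 0.0]])


def fundamental_matrix(K_ref, K_src, R, t):
    """Fundamental matrix F with x_src^T F x_ref = 0.

    Assumes the source-camera coordinates of a 3D point are
    X_src = R @ X_ref + t (an assumption of this sketch).
    """
    E = skew(t) @ R  # essential matrix
    return np.linalg.inv(K_src).T @ E @ np.linalg.inv(K_ref)


def sample_epipolar_points(F, pixel_ref, width, num_samples):
    """Sample points on the source-view epipolar line of a reference pixel.

    The line is a*u + b*v + c = 0 with (a, b, c) = F @ [u_ref, v_ref, 1].
    Points are spaced uniformly in u across the image width; a (near-)vertical
    line is rejected here to keep the sketch short.
    """
    x = np.array([pixel_ref[0], pixel_ref[1], 1.0])
    a, b, c = F @ x
    if abs(b) < 1e-12:
        raise ValueError("epipolar line is (near-)vertical; sample along v instead")
    u = np.linspace(0.0, width - 1.0, num_samples)
    v = -(a * u + c) / b
    return np.stack([u, v], axis=1)  # shape (num_samples, 2)


# Example: two identical cameras separated by a pure horizontal translation.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.array([0.2, 0.0, 0.0])
F = fundamental_matrix(K, K, R, t)
points = sample_epipolar_points(F, (100.0, 150.0), width=640, num_samples=5)
```

In this configuration the epipolar line is horizontal, so every sampled point keeps the row coordinate of the reference pixel. A feature sampler built on this idea would typically read source-view features at such points, for example by bilinear interpolation, before any cross-view attention is applied.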

Funding

National Natural Science Foundation of China (61931014)

History

Journal title

ACM Transactions on Multimedia Computing, Communications and Applications

Volume

19

Issue

6

Publisher website/DOI

Language

English
