ChiTransformer: Towards Reliable Stereo from Cues

03/09/2022
by Qing Su, et al.
Current stereo matching techniques are challenged by a restricted search space, occluded regions, and sheer size. Single-image depth estimation is spared these challenges and can achieve satisfactory results from extracted monocular cues, but the lack of a stereoscopic relationship renders the monocular prediction less reliable on its own, especially in highly dynamic or cluttered environments. To address these issues in both scenarios, we present an optic-chiasm-inspired self-supervised binocular depth estimation method, wherein a vision transformer (ViT) with a gated positional cross-attention (GPCA) layer is designed to enable feature-sensitive pattern retrieval between views while retaining the extensive context information aggregated through self-attention. Monocular cues from a single view are thereafter conditionally rectified by a blending layer with the retrieved pattern pairs. This crossover design is biologically analogous to the optic chiasm structure in the human visual system, hence the name ChiTransformer. Our experiments show that this architecture yields substantial improvements over state-of-the-art self-supervised stereo approaches by 11%, and can be used on both rectilinear and non-rectilinear (e.g., fisheye) images.
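
The abstract does not give the GPCA formula, so the following is only a minimal sketch of one way a gated positional cross-attention could work, not the authors' implementation. It assumes that a sigmoid gate blends a content-based attention map (queries from one view against keys from the other) with a position-based attention map. The function name `gated_cross_attention`, the `pos_scores` input, and the scalar `gate_logit` are all hypothetical names chosen for illustration.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def gated_cross_attention(queries, keys, values, pos_scores, gate_logit):
    """Blend content-based and position-based cross-attention (illustrative).

    queries:    token vectors from one view.
    keys/values: token vectors from the other view.
    pos_scores: pos_scores[i][j] is an assumed positional score between
                query token i and key token j.
    gate_logit: scalar; sigmoid(gate_logit) is the weight given to the
                positional attention map (an assumption, not from the paper).
    """
    scale = 1.0 / math.sqrt(len(queries[0]))
    gate = 1.0 / (1.0 + math.exp(-gate_logit))
    outputs = []
    for i, q in enumerate(queries):
        # Content attention: scaled dot product of the query with every key.
        content = softmax([scale * sum(a * b for a, b in zip(q, k)) for k in keys])
        # Positional attention: depends only on token positions.
        positional = softmax(pos_scores[i])
        # Convex combination, so each blended row still sums to 1.
        attn = [(1.0 - gate) * c + gate * p for c, p in zip(content, positional)]
        # Weighted sum of the value vectors from the other view.
        dim = len(values[0])
        outputs.append([sum(w * v[d] for w, v in zip(attn, values)) for d in range(dim)])
    return outputs
```

In this sketch a strongly positive `gate_logit` makes the layer attend almost purely by position, while a strongly negative one makes it attend almost purely by feature similarity between the two views.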


