NU-MCC: Multiview Compressive Coding with Neighborhood Decoder and Repulsive UDF

by Stefan Lionar, et al.

Remarkable progress has been made in 3D reconstruction from single-view RGB-D inputs. MCC, the current state-of-the-art method in this field, achieves unprecedented success by combining vision Transformers with large-scale training. However, we identified two key limitations of MCC: 1) the Transformer decoder is inefficient at handling a large number of query points; 2) the 3D representation struggles to recover high-fidelity details. In this paper, we propose a new approach called NU-MCC that addresses these limitations. NU-MCC includes two key innovations: a Neighborhood decoder and a Repulsive Unsigned Distance Function (Repulsive UDF). First, our Neighborhood decoder introduces center points as an efficient proxy for the input visual features, allowing each query point to attend only to a small neighborhood. This design not only yields much faster inference but also enables the exploitation of finer-scale visual features for improved recovery of 3D textures. Second, our Repulsive UDF is a novel alternative to the occupancy field used in MCC, significantly improving the quality of 3D object reconstruction. Compared to standard UDFs, whose results suffer from holes, our proposed Repulsive UDF achieves more complete surface reconstruction. Experimental results demonstrate that NU-MCC learns a strong 3D representation, significantly advancing the state of the art in single-view 3D reconstruction. In particular, it outperforms MCC by 9.7 in F1-score on the CO3D-v2 dataset while running more than 5x faster.
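
To make the neighborhood idea concrete, the following is a minimal sketch in which each query point aggregates features only from its k nearest center points. It is an illustration under stated assumptions, not the NU-MCC decoder itself: the function name `neighborhood_attend`, the parameter `k`, and the distance-based softmax weighting are all hypothetical choices made here, standing in for the learned attention the abstract refers to.

```python
import numpy as np

def neighborhood_attend(queries, centers, center_feats, k=4):
    """Aggregate features from the k nearest centers for each query.

    queries:      (Q, 3) query point coordinates
    centers:      (C, 3) center point coordinates, with k <= C
    center_feats: (C, D) one feature vector per center
    Returns (out, idx): out is (Q, D), idx is (Q, k) center indices.

    Assumption: weights are a softmax over negative squared distance,
    used here as a simple stand-in for learned attention.
    """
    # Pairwise squared distances between queries and centers, shape (Q, C).
    d2 = ((queries[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
    # Indices of the k nearest centers per query (unordered), shape (Q, k).
    idx = np.argpartition(d2, k - 1, axis=1)[:, :k]
    near_d2 = np.take_along_axis(d2, idx, axis=1)
    # Numerically stable softmax over the k neighbors only.
    logits = -near_d2
    logits = logits - logits.max(axis=1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=1, keepdims=True)
    # Weighted sum of the neighbors' features, shape (Q, D).
    out = (weights[:, :, None] * center_feats[idx]).sum(axis=1)
    return out, idx

rng = np.random.default_rng(0)
queries = rng.normal(size=(5, 3))
centers = rng.normal(size=(10, 3))
center_feats = rng.normal(size=(10, 8))
out, idx = neighborhood_attend(queries, centers, center_feats, k=4)
```

In this sketch the aggregation step touches k centers per query rather than all of them, which is the source of the saving the abstract describes. The dense distance matrix is kept only for brevity; a spatial index would likely replace it when the number of query points is large.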




3D-RETR: End-to-End Single and Multi-View 3D Reconstruction with Transformers

3D reconstruction aims to reconstruct 3D objects from 2D views. Previous...

TaylorImNet for Fast 3D Shape Reconstruction Based on Implicit Surface Function

Benefiting from the contiguous representation ability, deep implicit fun...

Learning Anchored Unsigned Distance Functions with Gradient Direction Alignment for Single-view Garment Reconstruction

While single-view 3D reconstruction has made significant progress benefi...

High-fidelity 3D Model Compression based on Key Spheres

In recent years, neural signed distance function (SDF) has become one of...

Multiview Compressive Coding for 3D Reconstruction

A central goal of visual recognition is to understand objects and scenes...

NeReF: Neural Refractive Field for Fluid Surface Reconstruction and Implicit Representation

Existing neural reconstruction schemes such as Neural Radiance Field (Ne...

3D Shape Reconstruction from Vision and Touch

When a toddler is presented a new toy, their instinctual behaviour is to...
