Multi-Resolution 3D Convolutional Neural Networks for Object Recognition

05/30/2018
by   Sambit Ghadai, et al.
2

Learning from 3D Data is a fascinating idea which is well explored and studied in computer vision. This allows one to learn from very sparse LiDAR data, point cloud data as well as 3D objects in terms of CAD models and surfaces etc. Most of the approaches to learn from such data are limited to uniform 3D volume occupancy grids or octree representations. A major challenge in learning from 3D data is that one needs to define a proper resolution to represent it in a voxel grid and this becomes a bottleneck for the learning algorithms. Specifically, while we focus on learning from 3D data, a fine resolution is very important to capture key features in the object and at the same time the data becomes sparser as the resolution becomes finer. There are numerous applications in computer vision where a multi-resolution representation is used instead of a uniform grid representation in order to make the applications memory efficient. Though such methods are difficult to learn from, they are much more efficient in representing 3D data. In this paper, we explore the challenges in learning from such data representation. In particular, we use a multi-level voxel representation where we define a coarse voxel grid that contains information of important voxels(boundary voxels) and multiple fine voxel grids corresponding to each significant voxel of the coarse grid. A multi-level voxel representation can capture important features in the 3D data in a memory efficient way in comparison to an octree representation. Consequently, learning from a 3D object with high resolution, which is paramount in feature recognition, is made efficient.

READ FULL TEXT

Please sign up or login with your details

Forgot password? Click here to reset