Pixel Difference Convolutional Network for RGB-D Semantic Segmentation

by Jun Yang, et al.

RGB-D semantic segmentation can be advanced with convolutional neural networks (CNNs) thanks to the availability of Depth data. Objects that cannot be easily discriminated by 2D appearance alone can, in some cases, be well separated using the local pixel differences and geometric patterns in Depth. Because of their fixed grid kernel structure, however, CNNs have limited ability to capture detailed, fine-grained information and thus cannot achieve accurate pixel-level semantic segmentation. To solve this problem, we propose a Pixel Difference Convolutional Network (PDCNet) that captures detailed intrinsic patterns by aggregating both intensity and gradient information, in a local range for Depth data and in a global range for RGB data. Specifically, PDCNet consists of a Depth branch and an RGB branch. For the Depth branch, we propose a Pixel Difference Convolution (PDC) that takes local, detailed geometric information in Depth data into account by aggregating both intensity and gradient information. For the RGB branch, we contribute a lightweight Cascade Large Kernel (CLK) that extends PDC into CPDC, which exploits global context in RGB data and further boosts performance. Consequently, the local and global pixel differences of both modalities are seamlessly incorporated into PDCNet during information propagation. Experiments on two challenging benchmark datasets, NYUDv2 and SUN RGB-D, show that PDCNet achieves state-of-the-art performance on the semantic segmentation task.
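To make the idea of "aggregating both intensity and gradient information" concrete, the following is a minimal single-channel sketch in plain Python. The abstract does not give the PDC formula, so everything here is an assumption: the function name `pixel_difference_conv`, the 3x3 window, and the blending weight `theta` are hypothetical, and the blend follows the general central-difference form (a weighted sum of neighbor values plus a weighted sum of neighbor-minus-center differences), not necessarily the exact operator used in PDCNet.

```python
def pixel_difference_conv(image, kernel, theta=0.5):
    """Blend a vanilla 3x3 convolution (intensity term) with a
    pixel-difference term (gradient term) on a single-channel image.

    image:  2D list of floats (e.g. a Depth map), at least 3x3
    kernel: 3x3 list of float weights
    theta:  hypothetical blending weight in [0, 1];
            0 -> pure intensity, 1 -> pure pixel difference
    No padding is applied, so the output is (H-2) x (W-2).
    """
    h, w = len(image), len(image[0])
    out = []
    for i in range(1, h - 1):
        row = []
        for j in range(1, w - 1):
            center = image[i][j]
            intensity = 0.0  # sum of w * x(neighbor)
            gradient = 0.0   # sum of w * (x(neighbor) - x(center))
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    wgt = kernel[di + 1][dj + 1]
                    val = image[i + di][j + dj]
                    intensity += wgt * val
                    gradient += wgt * (val - center)
            row.append(theta * gradient + (1.0 - theta) * intensity)
        out.append(row)
    return out


# A vertical step edge, as might appear at an object boundary in Depth.
depth = [[0.0, 0.0, 1.0, 1.0] for _ in range(3)]
ones = [[1.0] * 3 for _ in range(3)]
edge_response = pixel_difference_conv(depth, ones, theta=1.0)
```

With `theta=1.0`, a constant region produces a zero response and only the step edge contributes, which is the sense in which the difference term emphasizes local geometric detail; with `theta=0.0` the function reduces to an ordinary 3x3 convolution.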

