PointNeXt: Revisiting PointNet++ with Improved Training and Scaling Strategies

06/09/2022
by Guocheng Qian, et al.

PointNet++ is one of the most influential neural architectures for point cloud understanding. Although the accuracy of PointNet++ has been largely surpassed by recent networks such as PointMLP and Point Transformer, we find that a large portion of the performance gain is due to improved training strategies, i.e., data augmentation and optimization techniques, and increased model sizes rather than architectural innovations. Thus, the full potential of PointNet++ has yet to be explored. In this work, we revisit the classical PointNet++ through a systematic study of model training and scaling strategies, and offer two major contributions. First, we propose a set of improved training strategies that significantly improve PointNet++ performance. For example, we show that, without any change in architecture, the overall accuracy (OA) of PointNet++ on ScanObjectNN object classification can be raised from 77.9% to 86.1%, even outperforming the state-of-the-art PointMLP. Second, we introduce an inverted residual bottleneck design and separable MLPs into PointNet++ to enable efficient and effective model scaling, and propose PointNeXt, the next version of PointNets. PointNeXt can be flexibly scaled up and outperforms state-of-the-art methods on both 3D classification and segmentation tasks. For classification, PointNeXt reaches an overall accuracy of 87.7% on ScanObjectNN, surpassing PointMLP by 2.3% while being 10× faster in inference. For semantic segmentation, PointNeXt establishes a new state of the art with 74.9% mean IoU on S3DIS (6-fold cross-validation), surpassing the recent Point Transformer. The code and models are available at https://github.com/guochengqian/pointnext.
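To make the architectural change concrete, below is a minimal PyTorch sketch of an inverted-residual point-MLP block in the spirit of the design the abstract describes. It is an illustrative approximation under stated assumptions, not the authors' implementation (see the repository linked above for that): neighbor grouping is assumed to be precomputed, aggregation is reduced to a single max-pool over the neighbor dimension, and the 4× channel expansion follows the paper's inverted-bottleneck description.

# Sketch of an inverted-residual point-MLP block (not the official code).
import torch
import torch.nn as nn

class InvResMLP(nn.Module):
    """Inverted residual bottleneck over per-point features.

    Hypothetical simplifications: neighbor features are precomputed, and
    aggregation is a max-pool over the neighbor dimension K.
    """
    def __init__(self, channels: int, expansion: int = 4):
        super().__init__()
        hidden = channels * expansion
        # "Separable" design: one MLP applied to grouped neighbor features ...
        self.spatial_mlp = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
        )
        # ... followed by point-wise MLPs with an inverted (expanded) bottleneck.
        self.pointwise_mlp = nn.Sequential(
            nn.Conv1d(channels, hidden, kernel_size=1),
            nn.BatchNorm1d(hidden),
            nn.ReLU(inplace=True),
            nn.Conv1d(hidden, channels, kernel_size=1),
            nn.BatchNorm1d(channels),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor, grouped: torch.Tensor) -> torch.Tensor:
        # x:       (B, C, N)    per-point features
        # grouped: (B, C, N, K) features of the K neighbors of each point
        neighbor_feat = self.spatial_mlp(grouped).max(dim=-1).values  # (B, C, N)
        out = self.pointwise_mlp(neighbor_feat)
        return self.act(out + x)  # residual connection enables deep stacking

# Toy usage: 2 clouds, 64 channels, 1024 points, 16 neighbors per point.
block = InvResMLP(channels=64)
x = torch.randn(2, 64, 1024)
grouped = torch.randn(2, 64, 1024, 16)
print(block(x, grouped).shape)  # torch.Size([2, 64, 1024])

In the full model, blocks of this kind would be stacked within each stage on top of PointNet++'s set-abstraction downsampling; varying the number of blocks and channel widths per stage is what "flexibly scaled up" refers to in the abstract.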



Related research

06/09/2021 · Revisiting Point Cloud Shape Classification with a Simple and Effective Baseline
Processing point cloud data is an important component of many real-world...

10/11/2022 · Point Transformer V2: Grouped Vector Attention and Partition-based Pooling
As a pioneering work exploring transformer architecture for 3D point clo...

11/23/2020 · Scaling Wide Residual Networks for Panoptic Segmentation
The Wide Residual Networks (Wide-ResNets), a shallow but wide model vari...

09/03/2021 · Revisiting 3D ResNets for Video Recognition
A recent work from Bello shows that training and scaling strategies may ...

03/13/2021 · Revisiting ResNets: Improved Training and Scaling Strategies
Novel computer vision architectures monopolize the spotlight, but the im...

08/11/2022 · PointTree: Transformation-Robust Point Cloud Encoder with Relaxed K-D Trees
Being able to learn an effective semantic representation directly on raw...

09/03/2022 · Training Strategies for Improved Lip-reading
Several training strategies and temporal models have been recently propo...
