UniWorld: Autonomous Driving Pre-training via World Models

08/14/2023
by Chen Min, et al.

In this paper, we draw inspiration from Alberto Elfes' pioneering work in 1989, where he introduced the concept of the occupancy grid as a world model for robots. We equip the robot with a spatial-temporal world model, termed UniWorld, to perceive its surroundings and predict the future behavior of other participants. UniWorld is first pre-trained to predict 4D geometric occupancy, which serves as the world model in the foundational stage, and is then fine-tuned on downstream tasks. UniWorld can estimate missing information about the world state and predict plausible future states of the world. In addition, UniWorld's pre-training process is label-free, which allows massive amounts of image-LiDAR pairs to be used to build a foundation model. The proposed unified pre-training framework demonstrates promising results in key tasks such as motion prediction, multi-camera 3D object detection, and surrounding semantic scene completion. Compared with monocular pre-training methods on the nuScenes dataset, UniWorld improves motion prediction by about 1.5% in IoU, 3D object detection by about 2.0%, and semantic scene completion by about 3%. Adopting our unified pre-training method can reduce 3D training annotation costs by 25%, offering significant practical value for real-world autonomous driving. Code is publicly available at https://github.com/chaytonmin/UniWorld.
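To make the label-free idea concrete, the following is a minimal sketch of how geometric occupancy targets could be derived from LiDAR sweeps alone, with no human annotation. It is not the authors' implementation: the function names `occupancy_from_points` and `occupancy_sequence`, the grid parameters, and the assumption that all sweeps are already expressed in a common ego frame are hypothetical choices made for illustration.

```python
import numpy as np


def occupancy_from_points(points, grid_min, voxel_size, grid_shape):
    """Mark every voxel containing at least one LiDAR point as occupied.

    points:     (N, 3+) array of x, y, z coordinates (extra columns ignored).
    grid_min:   (3,) lower corner of the grid in the same frame as points.
    voxel_size: edge length of a cubic voxel (hypothetical parameter).
    grid_shape: (X, Y, Z) number of voxels along each axis.
    """
    grid = np.zeros(grid_shape, dtype=bool)
    # Convert metric coordinates to integer voxel indices.
    idx = np.floor((points[:, :3] - np.asarray(grid_min)) / voxel_size)
    idx = idx.astype(np.int64)
    # Discard points that fall outside the grid bounds.
    inside = np.all((idx >= 0) & (idx < np.asarray(grid_shape)), axis=1)
    idx = idx[inside]
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = True
    return grid


def occupancy_sequence(sweeps, grid_min, voxel_size, grid_shape):
    """Stack per-sweep 3D grids along a leading time axis: (T, X, Y, Z).

    Assumes each sweep has already been transformed into a common frame.
    """
    return np.stack(
        [occupancy_from_points(p, grid_min, voxel_size, grid_shape) for p in sweeps]
    )
```

In a pre-training setup of this kind, a camera-based network would be trained to predict such grids for current and future time steps, so the LiDAR data acts as supervision without any manual labels.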

