Vision transformers (ViTs) encoding an image as a sequence of patches br...
Semantic segmentation is a critical technology for autonomous vehicles t...
Online multi-object tracking (MOT) is an active research topic in the do...
Monocular 3D object detection is a promising research topic for the inte...