

Poster

Better Call SAL: Towards Learning to Segment Anything in Lidar

Aljosa Osep · Tim Meinhardt · Francesco Ferroni · Neehar Peri · Deva Ramanan · Laura Leal-Taixé

Fri 4 Oct 1:30 a.m. PDT — 3:30 a.m. PDT

Abstract:

We propose SAL (Segment Anything in Lidar), a text-promptable zero-shot model for segmenting and classifying any object in Lidar, together with a pseudo-labeling engine that facilitates model training without manual supervision. While the established paradigm for Lidar Panoptic Segmentation (LPS) relies on manual supervision for a handful of object classes defined a priori, we lean on Vision Foundation Models to generate supervision "for free" in the form of instance masks and corresponding localized text embeddings, which we distill to Lidar using calibrated multi-modal data. Even though our model is trained solely on self-generated pseudo-labels, SAL achieves 91% of the supervised model's performance in class-agnostic segmentation and 44% in zero-shot LPS on standard LPS datasets, and outperforms baselines that directly lift image features to 3D. More importantly, we show that SAL supports arbitrary class prompts, can be easily extended to new datasets, and shows significant potential to improve with an increased amount of self-labeled data.
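To make the text-promptable zero-shot step concrete, the sketch below illustrates the kind of matching the abstract implies: per-instance embeddings predicted from Lidar are compared against CLIP-style text embeddings of arbitrary class prompts by cosine similarity. This is a minimal illustration under stated assumptions, not the authors' released implementation; the embedding dimensions, the instance/prompt tensors, and the `classify_instances` helper are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F


def classify_instances(instance_emb: torch.Tensor,
                       prompt_emb: torch.Tensor,
                       class_names: list[str]) -> list[str]:
    """Assign each segmented Lidar instance the best-matching text prompt.

    instance_emb: (N, D) per-instance embeddings distilled from image models.
    prompt_emb:   (C, D) text embeddings of the class prompts.
    """
    inst = F.normalize(instance_emb, dim=-1)   # unit-normalize instance embeddings
    txt = F.normalize(prompt_emb, dim=-1)      # unit-normalize prompt embeddings
    sim = inst @ txt.T                         # (N, C) cosine similarities
    best = sim.argmax(dim=-1)                  # most similar prompt per instance
    return [class_names[i] for i in best.tolist()]


# Usage with random placeholders; real embeddings would come from the model
# and a text encoder such as CLIP.
classes = ["car", "pedestrian", "traffic cone"]
instances = torch.randn(5, 512)                # 5 segmented instances
prompts = torch.randn(len(classes), 512)       # text embeddings of the prompts
print(classify_instances(instances, prompts, classes))
```

Because classification reduces to nearest-prompt matching in a shared embedding space, the prompt vocabulary can be changed at inference time without retraining, which is what enables the arbitrary class prompts mentioned above.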
