ECCV Poster LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation

Poster

LaPose: Laplacian Mixture Shape Modeling for RGB-Based Category-Level Object Pose Estimation

Ruida Zhang · Ziqin Huang · Gu Wang · Chenyangguang Zhang · Yan Di · Xingxing Zuo · Jiwen Tang · Xiangyang Ji

Strong blind review: This paper was not made available on public preprint services during the review process

Strong Double Blind

[ Abstract ] [ Paper PDF ]

[ Supplemental]

2024 Poster

Abstract:

While RGBD-based methods for category-level object pose estimation hold promise, their reliance on depth data limits their applicability in diverse scenarios. In response, recent efforts have turned to RGB-based methods; however, they face significant challenges stemming from the absence of depth information. On one hand, the lack of depth exacerbates the difficulty in handling intra-class shape variation, resulting in increased uncertainty in shape predictions. On the other hand, RGB-only inputs introduce inherent scale ambiguity, rendering the estimation of object size and translation an ill-posed problem. To tackle these challenges, we propose LaPose, a novel framework that models the object shape as the Laplacian mixture model for Pose estimation. By representing each point as a probabilistic distribution, we explicitly quantify the shape uncertainty. LaPose leverages both a generalized 3D information stream and a specialized feature stream to independently predict the Laplacian distribution for each point, capturing different aspects of object geometry. These two distributions are then integrated as a Laplacian mixture model to establish the 2D-3D correspondences, which are utilized to solve the pose via the PnP module. In order to mitigate scale ambiguity, we introduce a scale-agnostic representation for object size and translation, enhancing training efficiency and overall robustness. Extensive experiments on the NOCS datasets validate the effectiveness of LaPose, yielding state-of-the-art performance in RGB-based category-level object pose estimation. Codes and trained models will be released.

Chat is not available.