

Poster

Coarse-to-Fine Implicit Representation Learning for 3D Hand-Object Reconstruction from a Single RGB-D Image

Xingyu Liu · Pengfei Ren · Jingyu Wang · Qi Qi · Haifeng Sun · Zirui Zhuang · Jianxin Liao

Strong Double Blind: this paper was not made available on public preprint services during the review process.
Wed 2 Oct 1:30 a.m. PDT — 3:30 a.m. PDT

Abstract:

Recent research has explored implicit representations, such as the signed distance function (SDF), for reconstructing interacting hands and objects. SDFs can model hand-held objects of arbitrary topology and overcome the resolution limitations of parametric models, enabling finer-grained reconstruction. However, modeling a detailed SDF directly from visual features is challenging due to depth ambiguity and appearance similarity, especially in cluttered real-world scenes. In this paper, we propose a coarse-to-fine SDF framework for 3D hand-object reconstruction that leverages the perceptual advantages of the RGB-D modality in both visual and geometric aspects to progressively model the implicit field. First, we model a coarse-level SDF from global image features to obtain a holistic perception of the 3D scene. We then propose a 3D Point-Aligned Implicit Function (3D PIFu) for fine-level SDF learning, which exploits local geometric cues from the point cloud to capture intricate details. To facilitate the transition from coarse to fine, we extract hand-object semantics from the implicit field as prior knowledge. Additionally, we propose a surface-aware efficient reconstruction strategy that sparsely samples query points based on the hand-object semantic prior. Experiments on two challenging hand-object datasets show that our method outperforms existing methods by a large margin.
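
The abstract describes three ingredients: a coarse SDF predicted from a global image feature, a point-aligned implicit function that refines it with local point-cloud geometry, and surface-aware sparse sampling of query points. The PyTorch sketch below illustrates that flow only in broad strokes; the module names, feature dimensions, k-nearest-neighbor feature aggregation, residual refinement, and the |SDF| threshold used for surface-aware sampling are illustrative assumptions, not the authors' actual architecture.

```python
# Minimal, self-contained sketch (PyTorch) of a coarse-to-fine SDF pipeline.
# Everything here is an assumption made for illustration only.
import torch
import torch.nn as nn


def mlp(dims):
    """Small helper: an MLP with ReLU between hidden layers."""
    layers = []
    for i in range(len(dims) - 1):
        layers.append(nn.Linear(dims[i], dims[i + 1]))
        if i < len(dims) - 2:
            layers.append(nn.ReLU(inplace=True))
    return nn.Sequential(*layers)


class CoarseSDF(nn.Module):
    """Coarse stage: predict hand/object SDF values for query points from a
    global image feature, giving a holistic but low-detail field."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.head = mlp([feat_dim + 3, 256, 128, 2])  # 2 = (hand SDF, object SDF)

    def forward(self, global_feat, query_xyz):
        # global_feat: (B, F), query_xyz: (B, N, 3)
        f = global_feat[:, None, :].expand(-1, query_xyz.shape[1], -1)
        return self.head(torch.cat([f, query_xyz], dim=-1))  # (B, N, 2)


class PointAlignedSDF(nn.Module):
    """Fine stage: a point-aligned implicit function that refines the coarse SDF
    using local geometry, approximated here by k-nearest-neighbor offsets of the
    back-projected depth point cloud around each query point."""
    def __init__(self, k=8):
        super().__init__()
        self.k = k
        self.local = mlp([3, 64, 64])          # per-neighbor offset encoding
        self.head = mlp([64 + 3 + 2, 128, 2])  # local feature + xyz + coarse prior

    def forward(self, points, query_xyz, coarse_sdf):
        # points: (B, P, 3) point cloud, query_xyz: (B, N, 3), coarse_sdf: (B, N, 2)
        d = torch.cdist(query_xyz, points)                    # (B, N, P)
        idx = d.topk(self.k, largest=False).indices           # (B, N, k)
        nbr = torch.gather(
            points[:, None].expand(-1, query_xyz.shape[1], -1, -1),
            2, idx[..., None].expand(-1, -1, -1, 3))           # (B, N, k, 3)
        offs = nbr - query_xyz[:, :, None, :]                  # local offsets
        local_feat = self.local(offs).max(dim=2).values        # (B, N, 64)
        x = torch.cat([local_feat, query_xyz, coarse_sdf], dim=-1)
        return coarse_sdf + self.head(x)                       # residual refinement


if __name__ == "__main__":
    B, N, P = 1, 4096, 2048
    coarse, fine = CoarseSDF(), PointAlignedSDF()
    global_feat = torch.randn(B, 256)        # e.g. pooled RGB-D backbone feature
    points = torch.randn(B, P, 3)            # back-projected depth point cloud
    query = torch.rand(B, N, 3) * 2 - 1      # uniform query points in [-1, 1]^3

    sdf_coarse = coarse(global_feat, query)  # (B, N, 2)

    # Surface-aware sparse sampling: only refine queries the coarse field
    # already places near the hand or object surface (threshold is assumed).
    near = sdf_coarse.abs().min(dim=-1).values < 0.05          # (B, N) mask
    q_near = query[near][None]                                  # (1, M, 3)
    sdf_fine = fine(points, q_near, sdf_coarse[near][None])     # (1, M, 2)
    print(sdf_coarse.shape, sdf_fine.shape)
```

In this reading, the coarse field acts as the hand-object prior the abstract mentions: it both conditions the fine stage and decides which query points are worth refining, so dense SDF evaluation is restricted to a thin band around the predicted surfaces.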
