

Poster

PoseSOR: Human Pose Can Guide Our Attention

Huankang Guan · Rynson W.H. Lau

Strong Double Blind: This paper was not made available on public preprint services during the review process.
[ Project Page ]
Thu 3 Oct 1:30 a.m. PDT — 3:30 a.m. PDT

Abstract:

Salient Object Ranking (SOR) aims to study how humans shift their attention among various objects within a scene. Previous works attempt to extract explicit visual saliency cues, e.g., spatial frequency and semantic context, to tackle this challenge. However, these visual saliency cues may fall short in real-world scenarios, which often involve various human activities and interactions. We observe that human observers' attention can be reflexively guided by the poses and gestures of the people in the scene, which indicate their activities. For example, observers tend to shift their attention to follow others' head orientation or running/walking direction to anticipate what will happen. Inspired by this observation, we propose to exploit human skeletal pose to understand the high-level interactions between human participants and their surroundings for robust salient object ranking. Specifically, we propose PoseSOR, a human pose-aware model for the SOR task, with two novel modules: 1) a Pose-Aware Interaction (PAI) Module that integrates human pose knowledge into salient object queries for learning high-level interactions, and 2) a Pose-Driven Ranking (PDR) Module that applies pose knowledge as directional cues to help predict where human attention will shift. To our knowledge, our approach is the first to explore human pose for salient object ranking. Extensive experiments demonstrate the effectiveness of our method for SOR, particularly on complex scenes, and our model sets a new state-of-the-art on the SOR benchmarks. The code will be made publicly available.
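The abstract outlines a two-module design: pose features are fused into object queries (PAI), and pose-conditioned queries are then scored to rank objects by attention order (PDR). Since the paper's implementation details are not given here, the following is only a minimal, hypothetical sketch of that pattern; the class names, the cross-attention fusion, and all shapes are assumptions for illustration, not the authors' code.

```python
# Hypothetical sketch of a pose-aware query fusion + ranking head (assumed design).
import torch
import torch.nn as nn

class PoseAwareInteraction(nn.Module):
    """Fuse pose tokens into salient object queries via cross-attention (assumption)."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, object_queries, pose_tokens):
        # object_queries: (B, Nq, D); pose_tokens: (B, Np, D)
        fused, _ = self.attn(object_queries, pose_tokens, pose_tokens)
        return self.norm(object_queries + fused)

class PoseDrivenRanking(nn.Module):
    """Predict a per-query saliency-rank score from pose-conditioned queries (assumption)."""
    def __init__(self, dim=256):
        super().__init__()
        self.score_head = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, 1))

    def forward(self, fused_queries):
        # Higher score = attended earlier; the ranking is obtained by sorting scores.
        return self.score_head(fused_queries).squeeze(-1)  # (B, Nq)

if __name__ == "__main__":
    B, Nq, Np, D = 2, 20, 17, 256          # batch, object queries, pose tokens, feature dim
    queries, poses = torch.randn(B, Nq, D), torch.randn(B, Np, D)
    pai, pdr = PoseAwareInteraction(D), PoseDrivenRanking(D)
    scores = pdr(pai(queries, poses))
    ranking = scores.argsort(dim=-1, descending=True)  # per-image object ranking
    print(ranking.shape)  # torch.Size([2, 20])
```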
