Poster

Frequency-Spatial Entanglement Learning for Camouflaged Object Detection

Yanguang Sun · Chunyan Xu · Jian Yang · Hanyu Xuan · Lei Luo

Strong blind review: This paper was not made available on public preprint services during the review process.
Thu 3 Oct 1:30 a.m. PDT — 3:30 a.m. PDT

Abstract:

Camouflaged object detection (COD) has attracted considerable attention in computer vision. The main challenge lies in the high degree of similarity between camouflaged objects and their surroundings in the spatial domain, which makes identification difficult. Existing methods attempt to reduce the impact of pixel similarity by maximizing the distinguishing ability of spatial features through complicated designs, but they often ignore the sensitivity and locality of features in the spatial domain, leading to sub-optimal results. In this paper, we propose a new approach to address this issue by jointly exploring the representation in the frequency and spatial domains, introducing the Frequency-Spatial Entanglement Learning (FSEL) method. This method consists of a series of well-designed Entanglement Transformer Blocks (ETB) for representation learning, a Joint Domain Perception Module (JDPM) for semantic enhancement, and a Dual-domain Reverse Parser (DRP) for feature integration in the frequency and spatial domains. Specifically, the ETB utilizes frequency self-attention (FSA) to effectively characterize the relationship between different frequency bands, while the entanglement feed-forward network (EFFN) facilitates information interaction between features of different domains through entanglement learning. Our extensive experiments demonstrate the superiority of FSEL over 21 state-of-the-art (SOTA) methods through comprehensive quantitative and qualitative comparisons on three widely used COD datasets.
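
To make the idea of attention over frequency components more concrete, the following is a minimal NumPy sketch, not the authors' FSA implementation. It assumes each 2D frequency bin of a feature map is treated as one token, and the function name `frequency_self_attention` and the projection matrices `w_q`, `w_k`, `w_v` are hypothetical names introduced only for illustration. The abstract does not specify how FSEL defines its tokens, projections, or normalization.

```python
import numpy as np


def softmax(z, axis=-1):
    # Numerically stable softmax along the given axis.
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def frequency_self_attention(x, w_q, w_k, w_v):
    """Illustrative attention across frequency bins (an assumption, not FSEL's FSA).

    x:   real feature map of shape (C, H, W)
    w_*: real projection matrices of shape (C, D), hypothetical parameters
    Returns a real feature map of shape (D, H, W) and the (H*W, H*W) attention map.
    """
    c, h, w = x.shape
    # Move each channel to the frequency domain with a 2D FFT.
    spec = np.fft.fft2(x, axes=(-2, -1), norm="ortho")      # (C, H, W), complex
    # One token per frequency bin, each holding C complex coefficients.
    tokens = spec.reshape(c, h * w).T                         # (N, C)
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v        # (N, D) each
    # Similarity between frequency bins: real part of the Hermitian inner product.
    scores = np.real(q @ k.conj().T) / np.sqrt(q.shape[1])    # (N, N)
    attn = softmax(scores, axis=-1)                           # each row sums to 1
    out = attn @ v                                            # (N, D), complex
    out_spec = out.T.reshape(-1, h, w)                        # (D, H, W)
    # The mixed spectrum is generally not conjugate-symmetric, so the inverse
    # transform is complex; this sketch simply keeps the real part.
    y = np.fft.ifft2(out_spec, axes=(-2, -1), norm="ortho").real
    return y, attn


rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))                 # toy feature map
w_q, w_k, w_v = (0.1 * rng.standard_normal((4, 4)) for _ in range(3))
y, attn = frequency_self_attention(x, w_q, w_k, w_v)
print(y.shape, attn.shape)
```

Because every frequency bin summarizes the whole spatial extent of the input, attention weights computed this way relate global patterns rather than neighboring pixels, which is one plausible reading of why a frequency-domain branch could complement local spatial features.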
