Poster
Rethinking Features-Fused-Pyramid-Neck for Object Detection
Hulin Li
# 104
Strong Double Blind
Multi-head detectors are widely used in industry for multi-scale detection. By convention, these detectors employ a features-fused-pyramid-neck. However, such a neck suffers from feature misalignment when representations from different hierarchy levels are forcibly fused point-to-point. To address this issue, we propose an independent hierarchy pyramid (IHP) architecture to evaluate the effectiveness of a features-unfused-pyramid-neck for multi-head detectors. We then introduce soft nearest neighbor interpolation (SNI), which mitigates feature misalignment by applying a weight-downscaling factor that minimizes the impact of fusion on features from different levels. Furthermore, we introduce a spatial window extension for down-sampling (ESD) to retain spatial features, and propose an enhanced lightweight convolutional technique. Finally, building on these advances, we design a secondary features alignment solution (SANs) for real-time detection and achieve state-of-the-art results on the Pascal VOC and MS COCO datasets. Code will be released soon.
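The idea behind SNI can be illustrated with a minimal sketch: upsample the coarser feature map by nearest-neighbor repetition, then scale it down before adding it to the finer map. The abstract does not state the value of the weight-downscaling factor, so the default used here (one over the squared upsampling ratio, which keeps the summed activation of the upsampled map equal to that of the original) is an assumption, as are the function names `soft_nearest_upsample` and `fuse` and the (C, H, W) array layout.

```python
import numpy as np
from typing import Optional


def soft_nearest_upsample(x: np.ndarray, scale: int = 2,
                          alpha: Optional[float] = None) -> np.ndarray:
    """Nearest-neighbor upsample a (C, H, W) map, then multiply by alpha.

    alpha is the weight-downscaling factor. Its default, 1 / scale**2,
    is an assumption made for this sketch, not a value from the abstract.
    """
    if alpha is None:
        alpha = 1.0 / (scale * scale)
    # Repeat every pixel `scale` times along height, then along width.
    upsampled = x.repeat(scale, axis=1).repeat(scale, axis=2)
    return alpha * upsampled


def fuse(fine: np.ndarray, coarse: np.ndarray, scale: int = 2) -> np.ndarray:
    """Point-to-point fusion where the coarse level is softened first."""
    return fine + soft_nearest_upsample(coarse, scale)


if __name__ == "__main__":
    coarse = np.ones((1, 2, 2))   # low-resolution level
    fine = np.zeros((1, 4, 4))    # high-resolution level
    fused = fuse(fine, coarse)
    print(fused.shape, fused[0, 0, 0])
```

With this default, the upsampled map contributes a quarter of its original per-pixel magnitude when the ratio is 2, so the finer level dominates the fused result. A different factor can be passed through `alpha` to compare behaviors.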