Detector-free local feature matching methods have improved significantly since adopting the Transformer architecture. The global receptive field of Transformers lets all elements interact simultaneously, which is particularly beneficial in regions with low texture or repetitive patterns. However, Transformer-based methods struggle to balance computational cost against expressive power when processing large numbers of patch-level features. In this work, we revisit existing detector-free methods and propose EcoMatcher, a universal matcher based on implicit clustering, called Context Clusters. Coarser-grained features are introduced as clustering centers, and similar patch-level features are allocated to the same center, forming different clustering patterns. Each feature in a cluster then receives the same message from its center, scaled according to its similarity to that center. This process defines a novel feature extraction paradigm for both self-understanding and cross-interaction of image pairs, which helps fuse multi-level features and reduces overall complexity. EcoMatcher is competitive among detector-free methods in memory demand and runtime speed, and it achieves strong performance on mainstream indoor and outdoor benchmarks.
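The clustering-and-dispatch idea described above can be illustrated with a minimal NumPy sketch. It is not the EcoMatcher implementation: the function name `context_cluster_step`, the cosine similarity, the sigmoid with hypothetical parameters `alpha` and `beta`, the hard assignment to the most similar center, and the residual update are all assumptions made for illustration.

```python
import numpy as np

def context_cluster_step(patches, centers, alpha=1.0, beta=0.0):
    """One illustrative cluster-and-dispatch step (hypothetical helper).

    patches: (N, D) patch-level features.
    centers: (M, D) coarser-grained features used as clustering centers.
    Returns updated patches (N, D), assignments (N,), and scales (N,).
    """
    eps = 1e-8
    # Assumption: cosine similarity passed through a sigmoid.
    p = patches / (np.linalg.norm(patches, axis=1, keepdims=True) + eps)
    c = centers / (np.linalg.norm(centers, axis=1, keepdims=True) + eps)
    sim = 1.0 / (1.0 + np.exp(-(alpha * (p @ c.T) + beta)))  # (N, M)

    # Allocate each patch to its most similar center (hard assignment).
    n = patches.shape[0]
    assign = sim.argmax(axis=1)                               # (N,)
    scale = sim[np.arange(n), assign]                         # (N,)

    # Keep only the similarity to the assigned center.
    weights = np.zeros_like(sim)
    weights[np.arange(n), assign] = scale                     # (N, M)

    # Aggregate: similarity-weighted average of each center and its members.
    messages = (centers + weights.T @ patches) / (1.0 + weights.sum(axis=0)[:, None])

    # Dispatch: every member receives its center's message, scaled by similarity.
    updated = patches + scale[:, None] * messages[assign]
    return updated, assign, scale

rng = np.random.default_rng(0)
patches = rng.normal(size=(16, 8))   # 16 patch-level features
centers = rng.normal(size=(4, 8))    # 4 clustering centers
updated, assign, scale = context_cluster_step(patches, centers)
print(updated.shape, assign)
```

In this sketch the similarity matrix has size N x M rather than the N x N of full attention, so the cost grows linearly with the number of patch features when the number of centers M is fixed.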