Poster
SMFANet: A Lightweight Self-Modulation Feature Aggregation Network for Efficient Image Super-Resolution
Mingjun Zheng · Long Sun · Jiangxin Dong · Jinshan Pan
| Strong Double Blind | 
                                
                
                
            
                            
                            2024 Poster
                            
                        
                        
                    
                        Abstract:
                        
                            Transformer-based restoration methods achieve strong performance because the self-attention (SA) of the Transformer can explore non-local information for better high-resolution image reconstruction. However, the key dot-product SA requires substantial computational resources, which limits its application on low-power devices. Moreover, the low-pass nature of the SA mechanism limits its capacity for capturing local details, leading to overly smooth reconstruction results. To address these issues, we propose a self-modulation feature aggregation (SMFA) module that collaboratively exploits both local and non-local feature interactions for more accurate reconstruction. Specifically, the SMFA module employs an efficient approximation of self-attention (EASA) branch to model non-local information and a local detail estimation (LDE) branch to capture local details. Additionally, we introduce a partial convolution-based feed-forward network (PCFN) to refine the representative features derived from the SMFA. Extensive experiments show that the proposed SMFANet family achieves a better trade-off between reconstruction performance and computational efficiency on public benchmark datasets. In particular, compared to the ×4 SwinIR-light, SMFANet+ achieves 0.14 dB higher performance on average over five public test sets and 10× faster runtime, with only about 43% of the model complexity (e.g., FLOPs).
                        
                    
                    