Abstract: By capturing the intensity and direction of light rays from the same scene, light field (LF) imaging encodes 3D scene cues into a 4D LF image, which has a wide range of applications (e.g., post-capture refocusing and depth sensing). LF image super-resolution (SR) aims to improve the image resolution, which is limited by the performance of the LF camera sensor. Although existing methods have achieved promising results, their practical application is limited because the models are not lightweight enough. In this paper, we propose a lightweight model named LGFN, which integrates the local and global features of different views and the features of different channels for LF image SR. Specifically, because neighboring regions at the same pixel position in different sub-aperture images exhibit similar structural relationships, we design a lightweight CNN-based feature extraction module (namely DGCE) that extracts local features more effectively through feature modulation. Meanwhile, because positions beyond the boundaries in the LF image exhibit large disparity, we propose an efficient spatial attention module (namely ESAM), which uses decomposable large-kernel convolution to obtain an enlarged receptive field, and an efficient channel attention module (namely ECAM). Compared with existing LF image SR models with large parameter counts, our model has only 0.45M parameters and 19.33G FLOPs while achieving competitive performance. Extensive experiments with ablation studies demonstrate the effectiveness of the proposed method, which ranked second in Track 2 (Fidelity & Efficiency) of the NTIRE2024 Light Field Super Resolution Challenge and seventh in Track 1 (Fidelity).
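To illustrate why a decomposable large-kernel convolution can enlarge the receptive field at low cost, the following is a minimal sketch that counts weights only. It assumes one common decomposition (a depthwise convolution, a dilated depthwise convolution, and a 1x1 pointwise convolution); the abstract does not state which decomposition ESAM uses, and the channel count, kernel sizes, and dilation below are hypothetical values chosen for illustration, not settings from LGFN.

```python
def dense_conv_params(channels: int, kernel: int) -> int:
    """Weights in a standard kernel x kernel convolution (channels -> channels, no bias)."""
    return kernel * kernel * channels * channels


def decomposed_conv_params(channels: int, k_local: int, k_dilated: int) -> int:
    """Weights in depthwise(k_local) + dilated depthwise(k_dilated) + 1x1 pointwise, no bias."""
    depthwise = k_local * k_local * channels
    dilated_depthwise = k_dilated * k_dilated * channels
    pointwise = channels * channels
    return depthwise + dilated_depthwise + pointwise


def receptive_field(k_local: int, k_dilated: int, dilation: int) -> int:
    """Side length of the receptive field of the two stacked stride-1 spatial convolutions."""
    return k_local + (k_dilated - 1) * dilation


# Hypothetical settings (assumptions, not taken from the paper).
channels, k_local, k_dilated, dilation = 64, 5, 7, 3

rf = receptive_field(k_local, k_dilated, dilation)
dense = dense_conv_params(channels, rf)  # dense conv covering the same receptive field
light = decomposed_conv_params(channels, k_local, k_dilated)

print(f"receptive field: {rf}x{rf}")
print(f"dense conv weights: {dense}")
print(f"decomposed conv weights: {light}")
```

Under these assumed settings, the decomposed stack covers the same spatial extent as the dense convolution with far fewer weights, which is the kind of trade-off a lightweight spatial attention module relies on.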