This study proposes a lightweight method for building image super-resolution based on a Dilated Contextual Feature Modulation Network (DCFMN). The pipeline consists of acquiring high-resolution images, down-sampling them to low resolution, augmenting the low-resolution images, constructing and training a lightweight network model, and generating super-resolution outputs. To address the regular textures and long-range dependencies characteristic of building images, the DCFMN integrates a dilated separable modulation unit and a local feature enhancement module. The former stacks multiple dilated convolutions whose combined receptive field is equivalent to a single large kernel, efficiently aggregating multi-scale features, and applies a simple attention mechanism for adaptivity. The latter encodes local features and mixes channel information, and through structural reparameterization it adds no computational burden at inference. This design overcomes the limitation of existing lightweight super-resolution networks in modeling long-range dependencies, achieving accurate and efficient global feature modeling without increasing computational cost, and improving both the reconstruction quality and the lightweight efficiency of the building image super-resolution model.
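To make the dilated separable modulation unit concrete, the following is a minimal PyTorch sketch of the idea described above: several depthwise dilated convolutions jointly cover a receptive field comparable to one large kernel, and the aggregated context modulates the input through a simple multiplicative attention gate. The class name, channel counts, dilation rates (1, 2, 3), and the sigmoid gating are illustrative assumptions, not the authors' exact configuration.

```python
import torch
import torch.nn as nn

class DilatedSeparableModulation(nn.Module):
    """Sketch of a dilated separable modulation unit: depthwise dilated
    convolutions approximate one large kernel, and the aggregated context
    modulates the input features (a simple attention mechanism)."""
    def __init__(self, channels: int, dilations=(1, 2, 3)):
        super().__init__()
        # Depthwise dilated 3x3 convs; padding=d keeps the spatial size,
        # and stacking dilations widens the effective receptive field.
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3,
                      padding=d, dilation=d, groups=channels)
            for d in dilations
        ])
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1)

    def forward(self, x):
        # Aggregate multi-scale context from all dilation branches.
        context = sum(branch(x) for branch in self.branches)
        # Simple attention: the context acts as a multiplicative gate.
        return x * torch.sigmoid(self.pointwise(context))

# Usage: y = DilatedSeparableModulation(32)(torch.randn(1, 32, 64, 64))
```

Because every spatial convolution here is depthwise and the only dense operation is a 1x1 convolution, the unit stays cheap even as the dilation rates grow the receptive field.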
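The local feature enhancement module's reparameterization can likewise be sketched with a standard RepVGG-style branch merge: during training, a 3x3 convolution (local encoding) runs in parallel with a 1x1 convolution (channel mixing); at deployment the two kernels are folded into a single 3x3 convolution, so the extra branch costs nothing at inference. The module structure and names below are assumptions for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RepLocalEnhance(nn.Module):
    """Sketch of a reparameterizable local feature enhancement block:
    two parallel branches at training time, one fused conv at inference."""
    def __init__(self, channels: int):
        super().__init__()
        self.conv3 = nn.Conv2d(channels, channels, 3, padding=1)  # local encoding
        self.conv1 = nn.Conv2d(channels, channels, 1)             # channel mixing
        self.fused = None  # populated by reparameterize()

    def forward(self, x):
        if self.fused is not None:
            return self.fused(x)
        return self.conv3(x) + self.conv1(x)

    @torch.no_grad()
    def reparameterize(self):
        # Pad the 1x1 kernel to 3x3 (value at the center) and add it to
        # the 3x3 kernel; by linearity of convolution, one fused conv
        # reproduces the two-branch training-time output exactly.
        fused = nn.Conv2d(self.conv3.in_channels,
                          self.conv3.out_channels, 3, padding=1)
        fused.weight.copy_(self.conv3.weight
                           + F.pad(self.conv1.weight, [1, 1, 1, 1]))
        fused.bias.copy_(self.conv3.bias + self.conv1.bias)
        self.fused = fused
```

Calling `reparameterize()` once after training collapses the block to a single 3x3 convolution, which is how the module can enrich training-time features while leaving the inference-time cost unchanged.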