Abstract: The Squeeze-and-Excitation (SE) block provides a channel attention mechanism that models the global context by explicitly capturing dependencies between channels. However, the SE block is still poorly understood. In this work, we first revisit the SE block and present a detailed empirical study of the relationship between the global context and the attention distribution, based on which we further propose a simple yet effective module. We call this module the Linear Context Transform (LCT) block; it implicitly captures dependencies between channels and linearly transforms the global context of each channel. The LCT block is extremely lightweight, with negligible parameters and computation. Extensive experiments show that the LCT block outperforms the SE block in image classification on ImageNet and in object detection/segmentation on COCO across many models. Moreover, we demonstrate that the LCT block yields consistent performance gains for existing state-of-the-art detection architectures. For example, the LCT block brings 1.5%$\sim$1.7% AP$^{bbox}$ and 1.0%$\sim$1.2% AP$^{mask}$ gains on the COCO benchmark, independent of the detector strength. We hope our work will provide new insight into channel attention.
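As a reading aid, the following is a minimal PyTorch-style sketch of a channel-gating module of the kind the abstract describes: the global context of each channel is obtained by global average pooling and then linearly transformed per channel before rescaling the feature maps. The group-wise normalization of the context, the sigmoid gate, and the `groups` parameter are illustrative assumptions and are not specified in the abstract itself.

```python
import torch
import torch.nn as nn


class LCTSketch(nn.Module):
    """Illustrative sketch of a linear-context-transform style channel gate.

    Assumptions (not stated in the abstract): the global context is
    normalized within channel groups before a per-channel affine transform,
    and the transformed context gates the input through a sigmoid.
    """

    def __init__(self, channels: int, groups: int = 16, eps: float = 1e-5):
        super().__init__()
        assert channels % groups == 0
        self.groups = groups
        self.eps = eps
        # Per-channel linear transform of the global context.
        self.weight = nn.Parameter(torch.ones(1, channels, 1, 1))
        self.bias = nn.Parameter(torch.zeros(1, channels, 1, 1))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        # Global context per channel via global average pooling.
        z = x.mean(dim=(2, 3), keepdim=True)               # (B, C, 1, 1)
        # Normalize the context within channel groups (assumption).
        z = z.view(b, self.groups, c // self.groups)
        z = (z - z.mean(dim=2, keepdim=True)) / torch.sqrt(
            z.var(dim=2, unbiased=False, keepdim=True) + self.eps)
        z = z.view(b, c, 1, 1)
        # Per-channel linear transform followed by sigmoid gating.
        gate = torch.sigmoid(self.weight * z + self.bias)
        return x * gate


# Usage: rescale a feature map with 64 channels.
feat = torch.randn(2, 64, 32, 32)
out = LCTSketch(64)(feat)
print(out.shape)  # torch.Size([2, 64, 32, 32])
```

The parameter count of such a gate is only $2C$ per block (one scale and one bias per channel), which is consistent with the abstract's claim of negligible parameter overhead.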