https://github.com/ZhiyuanLi218/segret.
With the ongoing advancement of autonomous driving technology and intelligent transportation systems, research into semantic segmentation has become increasingly pivotal. Accurate understanding and analysis of real-world scenarios are now essential for these emerging fields. However, traditional semantic segmentation methods often struggle to balance high model accuracy with computational efficiency, particularly in terms of parameter count. To address this challenge, we introduce SegRet, a novel approach that leverages the Retentive Network (RetNet) architecture and integrates a lightweight residual decoder featuring zero-initialization. SegRet exhibits three key characteristics: (1) Lightweight Residual Decoder: We incorporate a zero-initialization layer within the residual network framework, ensuring that the decoder remains computationally efficient while preserving critical information flow; (2) Robust Feature Extraction: Utilizing RetNet as the backbone, our model adeptly extracts hierarchical features from input images, thereby enhancing the depth and breadth of feature representation; (3) Parameter Efficiency: SegRet achieves state-of-the-art performance while significantly reducing the number of parameters, maintaining high accuracy without compromising on computational resources. Empirical evaluations on benchmark datasets such as ADE20K, Cityscapes, and COCO-Stuff10K demonstrate the efficacy of our approach. SegRet delivers impressive results, achieving an mIoU of 52.23\% on ADE20K with only 95.81M parameters, 83.36\% on Cityscapes, and 46.63\% on COCO-Stuff. The code is available at: