Abstract:Polyp segmentation plays a pivotal role in colorectal cancer diagnosis. Recently, the emergence of the Segment Anything Model (SAM) has introduced unprecedented potential for polyp segmentation, leveraging its powerful pre-training capability on large-scale datasets. However, due to the domain gap between natural and endoscopy images, SAM encounters two limitations in achieving effective performance in polyp segmentation. Firstly, its Transformer-based structure prioritizes global and low-frequency information, potentially overlooking local details, and introducing bias into the learned features. Secondly, when applied to endoscopy images, its poor out-of-distribution (OOD) performance results in substandard predictions and biased confidence output. To tackle these challenges, we introduce a novel approach named Augmented SAM for Polyp Segmentation (ASPS), equipped with two modules: Cross-branch Feature Augmentation (CFA) and Uncertainty-guided Prediction Regularization (UPR). CFA integrates a trainable CNN encoder branch with a frozen ViT encoder, enabling the integration of domain-specific knowledge while enhancing local features and high-frequency details. Moreover, UPR ingeniously leverages SAM's IoU score to mitigate uncertainty during the training procedure, thereby improving OOD performance and domain generalization. Extensive experimental results demonstrate the effectiveness and utility of the proposed method in improving SAM's performance in polyp segmentation. Our code is available at https://github.com/HuiqianLi/ASPS.
Abstract:This paper introduces the autonomous system of the "Smart Shark II" which won the Formula Student Autonomous China (FSAC) Competition in 2018. In this competition, an autonomous racecar is required to complete autonomously two laps of unknown track. In this paper, the author presents the self-driving software structure of this racecar which ensure high vehicle speed and safety. The key components ensure a stable driving of the racecar, LiDAR-based and Vision-based cone detection provide a redundant perception; the EKF-based localization offers high accuracy and high frequency state estimation; perception results are accumulated in time and space by occupancy grid map. After getting the trajectory, a model predictive control algorithm is used to optimize in both longitudinal and lateral control of the racecar. Finally, the performance of an experiment based on real-world data is shown.