Abstract: Synthetic aperture radar (SAR) is widely used in the maritime domain because of its all-weather, all-day monitoring capability, and it is particularly valuable for ship detection. In recent years, deep learning methods have increasingly been applied to fine-grained ship detection. However, learning-based methods generalize poorly to new scenarios and data, so experts must continually annotate new samples. The degree of automation in human-machine collaboration in this field, especially in annotating new data, is currently low, which makes model iteration and updating labor- and computation-intensive. To address these issues, a human-in-the-loop (HitL) framework for ship detection in SAR images is proposed. Following the HitL concept, active learning strategies tailored to SAR ship detection are designed to present valuable samples to users, and an interactive human-machine interface (HMI) is established to collect user feedback efficiently. User input is then used in each interaction round to improve model performance. Using the proposed framework, an annotated ship database of SAR images is constructed, and the iteration experiments conducted during its construction demonstrate the efficiency of the method, providing new perspectives and approaches for research in this domain.
Abstract: In artificial intelligence, the emergence of foundation models, backed by high computing capability and extensive data, has been revolutionary. The Segment Anything Model (SAM), built on the Vision Transformer (ViT) with millions of parameters and trained on the vast SA-1B dataset, excels in a wide range of segmentation scenarios thanks to the rich semantic information it captures and its generalization ability. The success of this visual foundation model has stimulated ongoing research on specific downstream tasks in computer vision. The ClassWise-SAM-Adapter (CWSAM) is designed to adapt the high-performing SAM to landcover classification on space-borne synthetic aperture radar (SAR) images. The proposed CWSAM freezes most of SAM's parameters and incorporates lightweight adapters for parameter-efficient fine-tuning, and a classwise mask decoder is designed to perform semantic segmentation. This adapter-tuning method enables efficient landcover classification of SAR images, balancing accuracy against computational demand. In addition, a task-specific input module injects low-frequency information from SAR images through MLP-based layers to improve model performance. In extensive experiments against conventional state-of-the-art semantic segmentation algorithms, CWSAM achieves better performance with fewer computing resources, highlighting the potential of leveraging foundation models like SAM for specific downstream tasks in the SAR domain. The source code is available at: https://github.com/xypu98/CWSAM.
Abstract: The electromagnetic inverse problem has long been a research hotspot. This study aims to inversely estimate the radar view angle of a synthetic aperture radar (SAR) image given a target model. However, the scarcity of SAR data, combined with intricate background interference and imaging mechanisms, limits the applicability of existing learning-based approaches. To address these challenges, we propose an interactive deep reinforcement learning (DRL) framework in which an electromagnetic simulator, the differentiable SAR renderer (DSR), is embedded to facilitate interaction between the agent and the environment, simulating a human-like process of angle prediction. Specifically, DSR generates SAR images at arbitrary view angles in real time. The sequential and semantic differences between the images corresponding to different view angles are used to construct the DRL state space, which effectively suppresses complex background interference, enhances sensitivity to temporal variations, and improves the ability to capture fine-grained information. Additionally, to maintain the stability and convergence of the method, a series of reward mechanisms, such as memory difference, smoothing, and boundary penalty, are combined to form the final reward function. Extensive experiments on both simulated and real datasets demonstrate the effectiveness and robustness of the proposed method. In cross-domain settings, the proposed method greatly mitigates the inconsistency between the simulated and real domains and significantly outperforms the reference methods.