Title: SonarT165: A Large-scale Benchmark and STFTrack Framework for Acoustic Object Tracking

Apr 22, 2025

Yunfeng Li, Bo Wang, Jiahao Wan, Xueyi Wu, Ye Li

Abstract: Underwater observation systems typically integrate optical cameras and imaging sonar systems. When underwater visibility is insufficient, only sonar systems can provide stable data, which necessitates exploration of the underwater acoustic object tracking (UAOT) task. Previous studies have explored traditional methods and Siamese networks for UAOT. However, the absence of a unified evaluation benchmark has significantly constrained the value of these methods. To alleviate this limitation, we propose the first large-scale UAOT benchmark, SonarT165, comprising 165 square sequences, 165 fan sequences, and 205K high-quality annotations. Experimental results demonstrate that SonarT165 reveals limitations in current state-of-the-art SOT trackers. To address these limitations, we propose STFTrack, an efficient framework for acoustic object tracking. It includes two novel modules: a multi-view template fusion module (MTFM) and an optimal trajectory correction module (OTCM). The MTFM integrates multi-view features of both the original image and the binary image of the dynamic template, and introduces a cross-attention-like layer to fuse the spatio-temporal target representations. The OTCM introduces the acoustic-response-equivalent pixel property and proposes normalized pixel brightness response scores, thereby suppressing suboptimal matches caused by inaccurate Kalman filter prediction boxes. To further improve the model's features, STFTrack introduces an acoustic image enhancement method and a Frequency Enhancement Module (FEM) into its tracking pipeline. Comprehensive experiments show that the proposed STFTrack achieves state-of-the-art performance on the proposed benchmark. The code is available at https://github.com/LiYunfengLYF/SonarT165.
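
To make the idea of a normalized pixel brightness response score more concrete, the following is a minimal illustrative sketch, not the authors' OTCM implementation. It assumes a grayscale sonar frame stored as a nested list, boxes given as `(x, y, width, height)`, and a hypothetical threshold `min_score`. The function names and the selection rule are assumptions made for illustration only. Each box is scored by its mean brightness divided by the largest mean among the boxes under comparison, and a Kalman-predicted box with a weak response is replaced by the brightest candidate.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) in pixels


def mean_brightness(image: Sequence[Sequence[float]], box: Box) -> float:
    """Average pixel intensity inside the box, clipped to the image bounds."""
    x, y, w, h = box
    rows = image[max(y, 0):max(y + h, 0)]
    values = [v for row in rows for v in row[max(x, 0):max(x + w, 0)]]
    return sum(values) / len(values) if values else 0.0


def normalized_brightness_scores(image, boxes: List[Box]) -> List[float]:
    """Divide each box's mean brightness by the largest one (scores in [0, 1])."""
    means = [mean_brightness(image, b) for b in boxes]
    peak = max(means, default=0.0)
    return [m / peak if peak > 0 else 0.0 for m in means]


def select_box(image, kalman_box: Box, candidates: List[Box],
               min_score: float = 0.5) -> Box:
    """Keep the Kalman box unless its score falls below min_score (assumed rule)."""
    boxes = [kalman_box] + candidates
    scores = normalized_brightness_scores(image, boxes)
    if scores[0] >= min_score:
        return kalman_box
    # Otherwise fall back to the box with the strongest brightness response.
    best = max(range(len(boxes)), key=scores.__getitem__)
    return boxes[best]
```

In this sketch, a Kalman prediction that lands on a dark background region scores near zero and is overridden, while a prediction that already covers the bright acoustic response is left unchanged.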
