Abstract: Analyzing ultrasonic vocalizations (USVs) is crucial for understanding rodents' affective states and social behaviors, but manual analysis is time-consuming and error-prone. Automated USV detection systems have been developed to address these challenges, yet they often rely on machine learning and fail to generalize effectively to new datasets. To address these shortcomings, we introduce ContourUSV, an efficient automated system for detecting USVs in audio recordings. Our pipeline comprises spectrogram generation, cleaning, pre-processing, contour detection, post-processing, and evaluation against manual annotations. To assess robustness and reliability, we compared ContourUSV with three state-of-the-art systems on an existing open-access USV dataset (USVSEG) and on a second dataset that we are releasing publicly with this paper. Averaged across the two datasets, ContourUSV outperformed the other three systems, with a 1.51x improvement in precision, 1.17x in recall, 1.80x in F1 score, and 1.49x in specificity, while achieving an average speedup of 117.07x.
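
The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes that detections have already been matched against the manual annotations and reduced to counts of true positives, false positives, false negatives, and true negatives. The abstract does not say how that matching is done, so the `Counts` container, the function names, and the reading of an "Nx improvement" as a simple ratio of scores are all hypothetical and are not taken from ContourUSV.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Counts:
    """Hypothetical confusion counts for one system on one dataset."""
    tp: int  # annotated USVs that were detected
    fp: int  # detections with no matching annotation
    fn: int  # annotated USVs that were missed
    tn: int  # non-USV segments correctly left undetected


def _safe_div(numerator: float, denominator: float) -> float:
    """Return 0.0 instead of raising when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def detection_metrics(c: Counts) -> dict:
    """Compute precision, recall, F1 score, and specificity from counts."""
    precision = _safe_div(c.tp, c.tp + c.fp)
    recall = _safe_div(c.tp, c.tp + c.fn)
    f1 = _safe_div(2 * precision * recall, precision + recall)
    specificity = _safe_div(c.tn, c.tn + c.fp)
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "specificity": specificity,
    }


def improvement_ratio(ours: float, baseline: float) -> float:
    """Assumed meaning of an 'Nx improvement': our score over the baseline's."""
    return _safe_div(ours, baseline)
```

With per-system counts in hand, `detection_metrics` could be called once per system and dataset, and `improvement_ratio` applied to each pair of scores before averaging. Whether the reported averages were computed this way is not stated in the abstract.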