Abstract: The detection of small objects is a challenging task in computer vision. Conventional object detection methods struggle to balance a high detection rate against a low false alarm rate. In the literature, some methods address this issue by enhancing the feature map responses, but they do not guarantee robustness with respect to the number of false alarms induced by background elements. To tackle this problem, we introduce an $\textit{a contrario}$ decision criterion into the learning process to take into account the unexpectedness of small objects. This statistical criterion enhances the feature map responses while controlling the number of false alarms (NFA), and it can be integrated into any semantic segmentation neural network. Our add-on NFA module not only yields competitive results on both small target detection and crack detection tasks, but also leads to more robust and interpretable results.
Abstract: Small target detection is an essential yet challenging task in defense applications, since differentiating low-contrast targets from textured and noisy natural backgrounds remains difficult. To better take contextual information into account, we explore several deep learning segmentation approaches, some of which are based on attention mechanisms. Specifically, we propose a customized version of TransUnet, a state-of-the-art network, that includes a channel attention module, which significantly improves performance. Moreover, the lack of annotated data reduces detection precision, leading to many irrelevant false alarms. We therefore explore a contrario methods to select the most meaningful potential targets detected by a network trained with little data.
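As an illustration of the a contrario selection idea mentioned above, here is a minimal sketch and not the authors' method. It assumes a naive Gaussian background model whose parameters are estimated from the detector scores themselves, and defines the NFA of a score as the number of tests multiplied by its tail probability under that model. The function names, the `epsilon` threshold, and the score values are all hypothetical.

```python
import math
from statistics import mean, pstdev


def gaussian_tail(z):
    """P(Z >= z) for a standard normal variable Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def nfa_select(scores, epsilon=1.0):
    """Return (index, NFA) pairs for scores whose NFA is below epsilon.

    Assumption: background scores follow a Gaussian law whose mean and
    standard deviation are estimated from the scores themselves.
    """
    n_tests = len(scores)
    mu = mean(scores)
    sigma = pstdev(scores)
    if sigma == 0.0:
        # A constant score list contains nothing unexpected.
        return []
    kept = []
    for i, s in enumerate(scores):
        # Expected number of background scores at least as large as s.
        nfa = n_tests * gaussian_tail((s - mu) / sigma)
        if nfa < epsilon:
            kept.append((i, nfa))
    return kept


# Hypothetical detector scores: mostly background, one strong response.
scores = [0.10, 0.12, 0.09, 0.11, 0.10, 0.13, 0.08, 0.11, 0.10, 0.12,
          0.09, 0.10, 0.11, 0.12, 0.10, 0.09, 0.11, 0.10, 0.12, 0.95]
detections = nfa_select(scores, epsilon=1.0)
```

With `epsilon = 1.0`, a candidate is kept only if fewer than one such response would be expected from the background alone, which is what lets the threshold be read as a bound on the expected number of false alarms under the assumed model.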