Marco Colussi

Improving Zero-shot ADL Recognition with Large Language Models through Event-based Context and Confidence

Jan 13, 2026

Michele Fiori, Gabriele Civitarese, Marco Colussi, Claudio Bettini

Abstract: Unobtrusive sensor-based recognition of Activities of Daily Living (ADLs) in smart homes by processing data collected from IoT sensing devices supports applications such as healthcare, safety, and energy management. Recent zero-shot methods based on Large Language Models (LLMs) have the advantage of removing the reliance on labeled ADL sensor data. However, existing approaches rely on time-based segmentation, which is poorly aligned with the contextual reasoning capabilities of LLMs. Moreover, existing approaches lack methods for estimating prediction confidence. This paper proposes to improve zero-shot ADL recognition with event-based segmentation and a novel method for estimating prediction confidence. Our experimental evaluation shows that event-based segmentation consistently outperforms time-based LLM approaches on complex, realistic datasets and surpasses supervised data-driven methods, even with relatively small LLMs (e.g., Gemma 3 27B). The proposed confidence measure effectively distinguishes correct from incorrect predictions.
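
The abstract contrasts time-based and event-based segmentation without spelling out either scheme. The sketch below illustrates one common reading from the smart-home literature, not the authors' implementation: time-based segmentation cuts the stream into fixed-duration windows, while event-based segmentation emits a window of the last `k` events each time a new event arrives. The function names, the sensor labels, and the parameters `window_s` and `k` are all hypothetical, and events are assumed to be sorted by timestamp.

```python
from typing import Dict, List, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, hypothetical sensor label)


def time_based_segments(events: List[Event], window_s: float) -> List[List[Event]]:
    """Group events into consecutive fixed-duration windows (empty windows are skipped)."""
    if not events:
        return []
    start = events[0][0]
    buckets: Dict[int, List[Event]] = {}
    for ts, label in events:
        idx = int((ts - start) // window_s)  # index of the window this event falls in
        buckets.setdefault(idx, []).append((ts, label))
    return [buckets[i] for i in sorted(buckets)]


def event_based_segments(events: List[Event], k: int) -> List[List[Event]]:
    """For every incoming event, emit a window holding the last k events (assumed scheme)."""
    return [events[max(0, i - k + 1): i + 1] for i in range(len(events))]


# Toy event stream with made-up sensor names.
events = [
    (0.0, "kitchen_motion"),
    (2.0, "fridge_open"),
    (3.0, "fridge_close"),
    (65.0, "stove_on"),
    (70.0, "kitchen_motion"),
]

time_windows = time_based_segments(events, window_s=60.0)
event_windows = event_based_segments(events, k=3)
```

In this toy stream the 60-second windows split the fridge interaction from the stove activation, whereas the last event-based window keeps `fridge_close`, `stove_on`, and the following motion event together as one context.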

MIAS-SAM: Medical Image Anomaly Segmentation without thresholding

May 28, 2025

Marco Colussi, Dragan Ahmetovic, Sergio Mascetti

Abstract: This paper presents MIAS-SAM, a novel approach for the segmentation of anomalous regions in medical images. MIAS-SAM uses a patch-based memory bank to store relevant image features, which are extracted from normal data using the SAM encoder. At inference time, the embedding patches extracted from the SAM encoder are compared with those in the memory bank to obtain the anomaly map. Finally, MIAS-SAM computes the center of gravity of the anomaly map to prompt the SAM decoder, obtaining an accurate segmentation from the previously extracted features. Unlike prior work, MIAS-SAM does not require defining a threshold value to obtain the segmentation from the anomaly map. Experiments conducted on three publicly available datasets, each with a different imaging modality (Brain MRI, Liver CT, and Retina OCT), show accurate anomaly segmentation capabilities, measured using the DICE score. The code is available at: https://github.com/warpcut/MIAS-SAM
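
The two steps named in the abstract, comparing patch embeddings with a memory bank and taking the center of gravity of the resulting anomaly map, can be sketched in a few lines. This is a toy illustration under stated assumptions, not the released MIAS-SAM code: the embeddings are hand-written 2-D vectors rather than SAM encoder features, Euclidean nearest-neighbor distance is assumed as the comparison, and the SAM decoder prompt itself is not shown. Function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Embedding = Sequence[float]


def anomaly_map(patch_grid: List[List[Embedding]],
                memory_bank: List[Embedding]) -> List[List[float]]:
    """Score each patch by its distance to the nearest memory-bank embedding."""
    return [
        [min(math.dist(patch, ref) for ref in memory_bank) for patch in row]
        for row in patch_grid
    ]


def center_of_gravity(amap: List[List[float]]) -> Tuple[float, float]:
    """Return the score-weighted mean (row, col) of the anomaly map."""
    total = sum(sum(row) for row in amap)
    if total == 0:
        # No anomaly signal at all: fall back to the geometric center of the grid.
        return ((len(amap) - 1) / 2, (len(amap[0]) - 1) / 2)
    row_c = sum(r * v for r, row in enumerate(amap) for v in row) / total
    col_c = sum(c * v for row in amap for c, v in enumerate(row)) / total
    return (row_c, col_c)


# Toy "normal" embeddings and a 2x3 grid of test patches; one patch is far from the bank.
memory_bank = [(0.0, 0.0), (1.0, 0.0)]
patch_grid = [
    [(0.0, 0.0), (1.0, 0.0), (0.0, 0.0)],
    [(0.0, 0.0), (1.0, 3.0), (0.0, 0.0)],
]

amap = anomaly_map(patch_grid, memory_bank)
prompt_point = center_of_gravity(amap)  # would serve as a point prompt in patch coordinates
```

Because the prompt point is a weighted mean rather than the result of a cut-off, no threshold on the anomaly scores is needed to locate the region of interest.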

ReC-TTT: Contrastive Feature Reconstruction for Test-Time Training

Nov 26, 2024

Marco Colussi, Sergio Mascetti, Jose Dolz, Christian Desrosiers

Abstract: The remarkable progress in deep learning (DL) has produced outstanding results in various computer vision tasks. However, adaptation to real-time variations in data distributions remains an important challenge. Test-Time Training (TTT) was proposed as an effective solution to this issue: it increases the generalization ability of trained models by adding an auxiliary task at train time and then using its loss at test time to adapt the model. Inspired by the recent achievements of contrastive representation learning in unsupervised tasks, we propose ReC-TTT, a test-time training technique that can adapt a DL model to new unseen domains by generating discriminative views of the input data. ReC-TTT uses cross-reconstruction as an auxiliary task between a frozen encoder and two trainable encoders, taking advantage of a single shared decoder. At test time, this makes it possible to adapt the encoders to extract features that will be correctly reconstructed by the decoder, which in this phase is frozen on the source domain. Experimental results show that ReC-TTT achieves better results than other state-of-the-art techniques in most domain shift classification challenges.

A Transfer Learning and Explainable Solution to Detect mpox from Smartphones images

May 29, 2023

Mattia Giovanni Campana, Marco Colussi, Franca Delmastro, Sergio Mascetti, Elena Pagani

Abstract: In recent months, the monkeypox (mpox) virus, previously endemic in a limited area of the world, started spreading in multiple countries and was declared a "public health emergency of international concern" by the World Health Organization. The alert was renewed in February 2023 due to a persistent, sustained incidence of the virus in several countries and worries about possible new outbreaks. Low-income countries with inadequate infrastructure for vaccine and testing administration are particularly at risk. A symptom of mpox infection is the appearance of skin rashes and eruptions, which can drive people to seek medical advice. A technology that might help perform a preliminary screening based on the appearance of skin lesions is the use of Machine Learning for image classification. However, to make this technology suitable on a large scale, it should be usable directly on people's mobile devices, with a possible notification to a remote medical expert. In this work, we investigate the adoption of Deep Learning to detect mpox from skin lesion images. The proposal leverages Transfer Learning to cope with the scarce availability of mpox image datasets. As a first step, a homogeneous, unpolluted dataset is produced by manual selection and preprocessing of available image data. It will also be released publicly to researchers in the field. Then, a thorough comparison is conducted among several Convolutional Neural Networks, based on a 10-fold stratified cross-validation. The best models are then optimized through quantization for use on mobile devices; measures of classification quality, memory footprint, and processing times validate the feasibility of our proposal. Additionally, the use of eXplainable AI is investigated as a suitable instrument to both technically and clinically validate classification outcomes.

* Submitted to Pervasive and Mobile Computing

Ultrasound Detection of Subquadricipital Recess Distension

Nov 22, 2022

Marco Colussi, Gabriele Civitarese, Dragan Ahmetovic, Claudio Bettini, Roberta Gualtierotti, Flora Peyvandi, Sergio Mascetti

Abstract: Joint bleeding is a common condition for people with hemophilia and, if untreated, can result in hemophilic arthropathy. Ultrasound imaging has recently emerged as an effective tool to diagnose joint recess distension caused by joint bleeding. However, no computer-aided diagnosis tool exists to support the practitioner in the diagnosis process. This paper addresses the problem of automatically detecting the recess and assessing whether it is distended in knee ultrasound images collected from patients with hemophilia. After framing the problem, we propose two different approaches: the first one adopts a one-stage object detection algorithm, while the second one is a multi-task approach with a classification and a detection branch. The experimental evaluation, conducted with 483 annotated images, shows that the solution based on object detection alone has a balanced accuracy score of 0.74 with a mean IoU value of 0.66, while the multi-task approach has a higher balanced accuracy value (0.78) at the cost of a slightly lower mean IoU value.
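
For readers unfamiliar with the two metrics reported above, the following sketch shows how balanced accuracy and box IoU are conventionally computed. It assumes a binary task (distended versus not distended) and axis-aligned boxes given as `(x1, y1, x2, y2)`; the labels and boxes are made-up toy values, not data from the paper, and the function names are hypothetical.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed axis-aligned


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


# Toy values only: two positives (one missed) and two negatives (both correct).
bal_acc = balanced_accuracy([1, 1, 0, 0], [1, 0, 0, 0])
# Two 2x2 boxes overlapping in a 1x1 square.
iou = box_iou((0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0))
```

Balanced accuracy averages the per-class recall, so it is not inflated when one class dominates the dataset; a mean IoU is the average of `box_iou` over all matched predicted and annotated boxes.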

Interpreting deep urban sound classification using Layer-wise Relevance Propagation

Nov 19, 2021

Marco Colussi, Stavros Ntalampiras

Abstract: After constructing a deep neural network for urban sound classification, this work focuses on the sensitive application of assisting drivers suffering from hearing loss. As such, a clear etiology justifying and interpreting model predictions is a strong requirement. To this end, we used two different representations of audio signals, i.e., Mel and constant-Q spectrograms, while the decisions made by the deep neural network are explained via layer-wise relevance propagation. At the same time, frequency content assigned high relevance in both feature sets indicates extremely discriminative information characterizing the present classification task. Overall, we present an explainable AI framework for understanding deep urban sound classification.
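
Layer-wise relevance propagation redistributes a prediction score backward through the network, layer by layer, so that each input receives a share of the relevance. The abstract does not say which propagation rule was used; the sketch below shows the widely used epsilon rule for a single fully connected layer as an assumption, with toy weights and a hypothetical function name.

```python
from typing import List


def lrp_epsilon_linear(activations: List[float],
                       weights: List[List[float]],
                       biases: List[float],
                       relevance_out: List[float],
                       eps: float = 1e-9) -> List[float]:
    """Propagate relevance through one linear layer with the epsilon rule.

    weights[i][j] connects input i to output j. Each output's relevance is
    split among the inputs in proportion to their contribution a_i * w_ij.
    """
    n_in, n_out = len(activations), len(relevance_out)
    # Pre-activations z_j = sum_i a_i * w_ij + b_j.
    z = [sum(activations[i] * weights[i][j] for i in range(n_in)) + biases[j]
         for j in range(n_out)]
    # Stabilizer: push z_j away from zero by eps, keeping its sign.
    z_stab = [zj + (eps if zj >= 0 else -eps) for zj in z]
    return [
        sum(activations[i] * weights[i][j] / z_stab[j] * relevance_out[j]
            for j in range(n_out))
        for i in range(n_in)
    ]


# Toy layer: 2 inputs, 2 outputs, all relevance placed on output 0.
activations = [1.0, 2.0]
weights = [[1.0, -1.0],
           [0.5, 1.0]]
biases = [0.0, 0.0]
relevance_in = lrp_epsilon_linear(activations, weights, biases, [1.0, 0.0])
```

With zero biases and a tiny `eps`, the relevance is approximately conserved: the input relevances sum to the relevance that entered the layer. Applied to a spectrogram input, the resulting per-bin relevances are what would highlight the frequency content driving a classification.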
