Benoit Macq

Patient-specific vs Multi-Patient Vision Transformer for Markerless Tumor Motion Forecasting

Jul 10, 2025

Gauthier Rotsart de Hertaing, Dani Manjah, Benoit Macq

Abstract: Background: Accurate forecasting of lung tumor motion is essential for precise dose delivery in proton therapy. While current markerless methods mostly rely on deep learning, transformer-based architectures remain unexplored in this domain, despite their proven performance in trajectory forecasting. Purpose: This work introduces a markerless forecasting approach for lung tumor motion using Vision Transformers (ViT). Two training strategies are evaluated under clinically realistic constraints: a patient-specific (PS) approach that learns individualized motion patterns, and a multi-patient (MP) model designed for generalization. The comparison explicitly accounts for the limited number of images that can be generated between planning and treatment sessions. Methods: Digitally reconstructed radiographs (DRRs) derived from planning 4DCT scans of 31 patients were used to train the MP model; a 32nd patient was held out for evaluation. PS models were trained using only the target patient's planning data. Both models used 16 DRRs per input and predicted tumor motion over a 1-second horizon. Performance was assessed using Average Displacement Error (ADE) and Final Displacement Error (FDE), on both planning (T1) and treatment (T2) data. Results: On T1 data, PS models outperformed MP models across all training set sizes, especially with larger datasets (up to 25,000 DRRs, p < 0.05). However, MP models demonstrated stronger robustness to inter-fractional anatomical variability and achieved comparable performance on T2 data without retraining. Conclusions: This is the first study to apply ViT architectures to markerless tumor motion forecasting. While PS models achieve higher precision, MP models offer robust out-of-the-box performance, well-suited for time-constrained clinical settings.
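The ADE and FDE metrics named in this abstract can be illustrated with a minimal sketch. It assumes the usual trajectory-forecasting definitions (mean Euclidean error over the forecast horizon, and Euclidean error at the final step); the paper's exact formulation, units, and dimensionality are not stated here. The function name `displacement_errors` and the sample positions are hypothetical.

```python
import numpy as np

def displacement_errors(pred, truth):
    """Return (ADE, FDE) for trajectories shaped (timesteps, dims)."""
    pred = np.asarray(pred, dtype=float)
    truth = np.asarray(truth, dtype=float)
    # Euclidean distance between predicted and true position at each step.
    dists = np.linalg.norm(pred - truth, axis=-1)
    # ADE averages over the whole horizon; FDE keeps only the last step.
    return dists.mean(), dists[-1]

# Hypothetical 2D tumor positions (in mm) over a 4-step forecast horizon.
truth = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 2.0], [0.0, 3.0]])
pred = np.array([[0.0, 0.0], [0.0, 1.5], [0.0, 2.0], [1.0, 3.0]])

ade, fde = displacement_errors(pred, truth)
print(f"ADE = {ade:.3f} mm, FDE = {fde:.3f} mm")
```

Because FDE looks only at the last forecast step, it tends to highlight drift at the end of the horizon that ADE can average away.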

CIA: Controllable Image Augmentation Framework Based on Stable Diffusion

Nov 25, 2024

Mohamed Benkedadra, Dany Rimez, Tiffanie Godelaine, Natarajan Chidambaram, Hamed Razavi Khosroshahi, Horacio Tellez, Matei Mancas, Benoit Macq, Sidi Ahmed Mahmoudi

Abstract: Computer vision tasks such as object detection and segmentation rely on the availability of extensive, accurately annotated datasets. In this work, we present CIA, a modular pipeline for (1) generating synthetic images for dataset augmentation using Stable Diffusion, (2) filtering out low-quality samples using defined quality metrics, and (3) forcing the existence of specific patterns in generated images using accurate prompting and ControlNet. To show how CIA can be used to search for an optimal augmentation pipeline of training data, we study human object detection in a data-constrained scenario, using YOLOv8n on the COCO and Flickr30k datasets. We recorded significant improvement using CIA-generated images, approaching the performance obtained when doubling the amount of real images in the dataset. Our findings suggest that our modular framework can significantly enhance object detection systems and enable future research on data-constrained scenarios. The framework is available at: github.com/multitel-ai/CIA.

* 2024 IEEE 7th International Conference on Multimedia Information Processing and Retrieval (MIPR). 10.1109/MIPR62202.2024

Camera clustering for scalable stream-based active distillation

Apr 16, 2024

Dani Manjah, Davide Cacciarelli, Christophe De Vleeschouwer, Benoit Macq

Abstract: We present a scalable framework designed to craft efficient lightweight models for video object detection utilizing self-training and knowledge distillation techniques. We scrutinize methodologies for the ideal selection of training images from video streams and the efficacy of model sharing across numerous cameras. By advocating for a camera clustering methodology, we aim to diminish the requisite number of models for training while augmenting the distillation dataset. The findings affirm that proper camera clustering notably amplifies the accuracy of distilled models, eclipsing the methodologies that employ distinct models for each camera or a universal model trained on the aggregate camera data.

* This manuscript is currently under review at IEEE Transactions on Circuits and Systems for Video Technology

Streamlined Hybrid Annotation Framework using Scalable Codestream for Bandwidth-Restricted UAV Object Detection

Feb 07, 2024

Karim El Khoury, Tiffanie Godelaine, Simon Delvaux, Sebastien Lugan, Benoit Macq

Abstract: Emergency response missions depend on the fast relay of visual information, a task to which unmanned aerial vehicles are well adapted. However, the effective use of unmanned aerial vehicles is often compromised by bandwidth limitations that impede fast data transmission, thereby delaying the quick decision-making necessary in emergency situations. To address these challenges, this paper presents a streamlined hybrid annotation framework that utilizes the JPEG 2000 compression algorithm to facilitate object detection under limited bandwidth. The proposed framework employs a fine-tuned deep learning network for initial image annotation at lower resolutions and uses JPEG 2000's scalable codestream to selectively enhance the image resolution in critical areas that require human expert annotation. We show that our proposed hybrid framework reduces the response time by a factor of 34 in emergency situations compared to a baseline approach.

Solving inverse problems with deep neural networks driven by sparse signal decomposition in a physics-based dictionary

Jul 16, 2021

Gaetan Rensonnet, Louise Adam, Benoit Macq

Abstract: Deep neural networks (DNN) have an impressive ability to invert very complex models, i.e. to learn the generative parameters from a model's output. Once trained, the forward pass of a DNN is often much faster than traditional, optimization-based methods used to solve inverse problems. This is however done at the cost of lower interpretability, a fundamental limitation in most medical applications. We propose an approach for solving general inverse problems which combines the efficiency of DNN and the interpretability of traditional analytical methods. The measurements are first projected onto a dense dictionary of model-based responses. The resulting sparse representation is then fed to a DNN with an architecture driven by the problem's physics for fast parameter learning. Our method can handle generative forward models that are costly to evaluate and exhibits similar performance in accuracy and computation time as a fully-learned DNN, while maintaining high interpretability and being easier to train. Concrete results are shown on an example of model-based brain parameter estimation from magnetic resonance imaging (MRI).

* Accepted for publication in Workshop on Interpretable ML in Healthcare at International Conference on Machine Learning (ICML) 2021. 10 pages (including 3 for references), 4 figures

Prognostic Power of Texture Based Morphological Operations in a Radiomics Study for Lung Cancer

Dec 23, 2020

Paul Desbordes, Diksha, Benoit Macq

Abstract: The importance of radiomics features for predicting patient outcome is now well-established. Early study of prognostic features can lead to more efficient treatment personalisation. For this reason, new radiomics features obtained through mathematical morphology (MM) operations are proposed. Their study is conducted on an open database of patients suffering from Non-Small Cell Lung Carcinoma (NSCLC). The tumor features are extracted from the CT images and analyzed via PCA and a Kaplan-Meier survival analysis in order to select the most relevant ones. Among the 1,589 studied features, 32 are found relevant to predict patient survival: 27 classical radiomics features and five MM features (including both granularity and morphological covariance features). These features will contribute towards the prognostic models, and eventually to clinical decision making and the course of treatment for patients.

* 9 pages, 3 tables, 3 figures, 31 references
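The Kaplan-Meier analysis mentioned in this abstract can be sketched as follows. This is a minimal illustration of the standard product-limit estimator for right-censored data, not the authors' pipeline; the function name `kaplan_meier` and the follow-up times are hypothetical.

```python
import numpy as np

def kaplan_meier(times, events):
    """Product-limit survival estimate for right-censored data.

    times[i]  : follow-up time of patient i
    events[i] : 1 if death was observed at times[i], 0 if censored
    Returns (event_times, survival) evaluated at each distinct death time.
    """
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_times = np.unique(times[events == 1])
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(times >= t)                    # still under observation
        deaths = np.sum((times == t) & (events == 1))   # observed deaths at t
        s *= 1.0 - deaths / at_risk
        survival.append(s)
    return event_times, np.array(survival)

# Hypothetical follow-up data (months); 0 marks a censored patient.
times = [2, 3, 3, 5, 8]
events = [1, 1, 0, 1, 0]
t, s = kaplan_meier(times, events)
for ti, si in zip(t, s):
    print(f"t = {ti:g}: S(t) = {si:.2f}")
```

In a feature-selection setting, one plausible use is to split patients at a feature threshold and compare the two resulting curves; the abstract does not say how the split or the comparison was done.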

Secure Architectures Implementing Trusted Coalitions for Blockchained Distributed Learning (TCLearn)

Jun 18, 2019

Sebastien Lugan, Paul Desbordes, Luis Xavier Ramos Tormo, Axel Legay, Benoit Macq

Abstract: Distributed learning across a coalition of organizations allows the members of the coalition to train and share a model without sharing the data used to optimize this model. In this paper, we propose new secure architectures that guarantee preservation of data privacy, trustworthy sequence of iterative learning and equitable sharing of the learned model among each member of the coalition by using adequate encryption and blockchain mechanisms. We exemplify its deployment in the case of the distributed optimization of a deep learning convolutional neural network trained on medical images.

Invariant Spectral Hashing of Image Saliency Graph

Sep 15, 2010

Maxime Taquet, Laurent Jacques, Christophe De Vleeschouwer, Benoit Macq

Abstract: Image hashing is the process of associating a short vector of bits to an image. The resulting summaries are useful in many applications including image indexing, image authentication and pattern recognition. These hashes need to be invariant under transformations of the image that result in similar visual content, but should drastically differ for conceptually distinct contents. This paper proposes an image hashing method that is invariant under rotation, scaling and translation of the image. The gist of our approach relies on the geometric characterization of salient point distribution in the image. This is achieved by the definition of a "saliency graph" connecting these points jointly with an image intensity function on the graph nodes. An invariant hash is then obtained by considering the spectrum of this function in the eigenvector basis of the graph Laplacian, that is, its graph Fourier transform. Interestingly, this spectrum is invariant under any relabeling of the graph nodes. The graph reveals geometric information of the image, making the hash robust to image transformation, yet distinct for different visual content. The efficiency of the proposed method is assessed on a set of MRI 2-D slices and on a database of faces.

* Keywords: Invariant Hashing, Geometrical Invariant, Spectral Graph, Salient Points. Content: 8 pages, 7 figures, 1 table
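The relabeling invariance of the graph Fourier spectrum described in this abstract can be checked numerically with a small sketch. It makes several assumptions that are not from the paper: the graph is a random weighted complete graph rather than a saliency graph, the node signal is random rather than image intensity, the Laplacian eigenvalues are distinct, and coefficient magnitudes are used to remove the arbitrary sign of each eigenvector. The name `spectral_signature` is hypothetical.

```python
import numpy as np

def spectral_signature(weights, signal):
    """Magnitudes of graph Fourier coefficients, ordered by eigenvalue."""
    # Combinatorial graph Laplacian L = D - W.
    laplacian = np.diag(weights.sum(axis=1)) - weights
    # eigh returns ascending eigenvalues with orthonormal eigenvectors.
    _, eigvecs = np.linalg.eigh(laplacian)
    # Project the signal onto the eigenvector basis (graph Fourier transform);
    # the absolute value discards each eigenvector's arbitrary sign.
    return np.abs(eigvecs.T @ signal)

rng = np.random.default_rng(0)
n = 6

# Hypothetical symmetric edge weights and node signal.
w = np.triu(rng.random((n, n)), 1)
w = w + w.T
f = rng.random(n)

# Relabel the nodes with a random permutation.
perm = rng.permutation(n)
w_perm = w[np.ix_(perm, perm)]
f_perm = f[perm]

same = np.allclose(spectral_signature(w, f), spectral_signature(w_perm, f_perm))
print("signature unchanged by relabeling:", same)
```

With repeated eigenvalues the eigenvectors are no longer unique up to sign, so a practical hash would need to handle that case separately; the abstract does not say how the authors do so.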
