Abstract: Topological correctness plays a critical role in many image segmentation tasks, yet most networks are trained using pixel-wise loss functions, such as Dice, neglecting topological accuracy. Existing topology-aware methods often lack robust topological guarantees, are limited to specific use cases, or impose high computational costs. In this work, we propose a novel, graph-based framework for topologically accurate image segmentation that is both computationally efficient and generally applicable. Our method constructs a component graph that fully encodes the topological information of both the prediction and ground truth, allowing us to efficiently identify topologically critical regions and aggregate a loss based on local neighborhood information. Furthermore, we introduce a strict topological metric capturing the homotopy equivalence between the union and intersection of prediction-label pairs. We formally prove the topological guarantees of our approach and empirically validate its effectiveness on binary and multi-class datasets. Our loss demonstrates state-of-the-art performance with up to fivefold faster loss computation compared to persistent homology methods.
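A minimal sketch of the component-graph idea described above, not the authors' implementation: connected components of a binarized prediction and of the ground truth serve as graph nodes, overlaps between them as edges, and components whose neighborhood structure disagrees between prediction and label (spurious, merged, or split components) are flagged as topologically critical. Function and variable names are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage


def component_graph_edges(pred_bin: np.ndarray, label_bin: np.ndarray):
    """Label connected components and return the overlap edges between them."""
    pred_cc, n_pred = ndimage.label(pred_bin)
    gt_cc, n_gt = ndimage.label(label_bin)
    # An edge (p, g) exists if predicted component p overlaps ground-truth component g.
    pairs = np.stack([pred_cc.ravel(), gt_cc.ravel()], axis=1)
    pairs = pairs[(pairs[:, 0] > 0) & (pairs[:, 1] > 0)]
    edges = set(map(tuple, pairs))
    return n_pred, n_gt, edges


def critical_component_ids(n_pred: int, n_gt: int, edges):
    """Flag components whose graph degree signals a topological error."""
    pred_deg = {p: 0 for p in range(1, n_pred + 1)}
    gt_deg = {g: 0 for g in range(1, n_gt + 1)}
    for p, g in edges:
        pred_deg[p] += 1
        gt_deg[g] += 1
    # Degree 0 means a spurious/missed component; degree > 1 means a merge or split.
    critical_pred = [p for p, d in pred_deg.items() if d != 1]
    critical_gt = [g for g, d in gt_deg.items() if d != 1]
    return critical_pred, critical_gt
```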
Abstract: Topological accuracy in medical image segmentation is a highly important property for downstream applications such as network analysis and flow modeling in vessels or cell counting. Recently, significant methodological advancements have brought well-founded concepts from algebraic topology to binary segmentation. However, these approaches have been underexplored in multi-class segmentation scenarios, where topological errors are common. We propose a general loss function for topologically faithful multi-class segmentation extending the recent Betti matching concept, which is based on induced matchings of persistence barcodes. We project the N-class segmentation problem onto N single-class segmentation tasks, which allows us to use 1-parameter persistent homology, making the training of neural networks computationally feasible. We validate our method on a comprehensive set of four medical datasets with highly varied topological characteristics. Our loss formulation significantly enhances topological correctness in cardiac, cell, artery-vein, and Circle of Willis segmentation.
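A hedged sketch of the class-wise projection: the N-class prediction is split into N one-vs-rest binary problems, and a single-class topology-aware term is accumulated over classes. Here `binary_topo_loss` is a placeholder standing in for the Betti-matching-based loss; the names are illustrative, not the authors' API.

```python
import torch
import torch.nn.functional as F


def multiclass_topo_loss(logits: torch.Tensor, target: torch.Tensor, binary_topo_loss) -> torch.Tensor:
    """logits: (B, N, H, W); target: (B, H, W) with integer class labels."""
    probs = F.softmax(logits, dim=1)
    n_classes = probs.shape[1]
    loss = probs.new_zeros(())
    for c in range(n_classes):
        pred_c = probs[:, c]                # foreground likelihood of class c
        target_c = (target == c).float()    # one-vs-rest binary ground truth
        loss = loss + binary_topo_loss(pred_c, target_c)
    return loss / n_classes
```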
Abstract: Direct image-to-graph transformation is a challenging task that solves object detection and relationship prediction in a single model. Due to the complexity of this task, large training datasets are rare in many domains, which makes the training of large networks challenging. This data sparsity necessitates the establishment of pretraining strategies akin to the state of the art in computer vision. In this work, we introduce a set of methods enabling cross-domain and cross-dimension transfer learning for image-to-graph transformers. We propose (1) a regularized edge sampling loss for sampling the optimal number of object relationships (edges) across domains, (2) a domain adaptation framework for image-to-graph transformers that aligns features from different domains, and (3) a simple projection function that allows us to pretrain 3D transformers on 2D input data. We demonstrate our method's utility in cross-domain and cross-dimension experiments, where we pretrain our models on 2D satellite images before applying them to vastly different target domains in 2D and 3D. Our method consistently outperforms a series of baselines on challenging benchmarks, such as retinal or whole-brain vessel graph extraction.
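An illustrative sketch, under assumptions rather than the paper's exact projection, of how 2D inputs can be lifted so a 3D image-to-graph transformer can be pretrained on them: the 2D image is repeated along a depth axis to form a thin volume, and 2D node coordinates receive a constant depth coordinate. Names and the fixed-depth choice are hypothetical.

```python
import torch


def lift_image_2d_to_3d(img2d: torch.Tensor, depth: int = 8) -> torch.Tensor:
    """img2d: (B, C, H, W) -> (B, C, D, H, W) by repetition along a new depth axis."""
    return img2d.unsqueeze(2).expand(-1, -1, depth, -1, -1).contiguous()


def lift_nodes_2d_to_3d(nodes2d: torch.Tensor, depth_coord: float = 0.5) -> torch.Tensor:
    """nodes2d: (num_nodes, 2) normalized (x, y) -> (num_nodes, 3) with a fixed z."""
    z = torch.full((nodes2d.shape[0], 1), depth_coord, dtype=nodes2d.dtype)
    return torch.cat([nodes2d, z], dim=1)
```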
Abstract: Biophysical modeling, particularly involving partial differential equations (PDEs), offers significant potential for tailoring disease treatment protocols to individual patients. However, the inverse problem-solving aspect of these models presents a substantial challenge, either due to the high computational requirements of model-based approaches or the limited robustness of deep learning (DL) methods. We propose a novel framework that leverages the unique strengths of both approaches in a synergistic manner. Our method incorporates a DL ensemble for initial parameter estimation, facilitating efficient downstream evolutionary sampling initialized with this DL-based prior. We showcase the effectiveness of integrating a rapid deep-learning algorithm with a high-precision evolution strategy in estimating brain tumor cell concentrations from magnetic resonance images. The DL prior plays a pivotal role, significantly constraining the effective sampling-parameter space. This reduction results in a fivefold convergence acceleration and a Dice score of 95%.
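A conceptual sketch, not the published pipeline, of using a DL ensemble as a prior for evolutionary sampling: the ensemble's per-parameter mean and spread initialize the search distribution, which a simple (mu, lambda)-style evolution strategy then refines against a forward model. `simulate_and_score` is a hypothetical placeholder for the PDE-based tumor-growth simulation plus a Dice-style fitness.

```python
import numpy as np


def es_with_dl_prior(ensemble_estimates: np.ndarray, simulate_and_score,
                     n_generations: int = 50, pop_size: int = 32,
                     elite_frac: float = 0.25, rng=None):
    """ensemble_estimates: (n_models, n_params) parameter predictions of the DL ensemble."""
    rng = rng or np.random.default_rng(0)
    mean = ensemble_estimates.mean(axis=0)            # DL prior: center of the search
    sigma = ensemble_estimates.std(axis=0) + 1e-6     # DL prior: constrained spread
    n_elite = max(1, int(pop_size * elite_frac))
    for _ in range(n_generations):
        population = rng.normal(mean, sigma, size=(pop_size, mean.size))
        scores = np.array([simulate_and_score(p) for p in population])
        elite = population[np.argsort(scores)[-n_elite:]]   # keep the best-scoring samples
        mean, sigma = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean
```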