Abstract:Transformers are widely adopted in modern vision models due to their strong ability to scale with dataset size and generalisability. However, this comes with a major drawback: computation scales quadratically to the total number of tokens. Numerous methods have been proposed to mitigate this. For example, we consider token pruning with reactivating tokens from preserved representations, but the increased computational efficiency of this method results in decreased stability from the preserved representations, leading to poorer dense prediction performance at deeper layers. In this work, we introduce Reg4Pru, a training regularisation technique that mitigates token-pruning performance loss for segmentation. We compare our models on the FIVES blood vessel segmentation dataset and find that Reg4Pru improves average precision by an absolute 46% compared to the same model trained without routing. This increase is observed using a configuration that achieves a 29% relative speedup in wall-clock time compared to the non-pruned baseline. These findings indicate that Reg4Pru is a valuable regulariser for token reduction strategies.
Abstract:Cephalometric Landmark Detection is the process of identifying key areas for cephalometry. Each landmark is a single GT point labelled by a clinician. A machine learning model predicts the probability locus of a landmark represented by a heatmap. This work, for the 2024 CL-Detection MICCAI Challenge, proposes a domain alignment strategy with a regional facial extraction module and an X-ray artefact augmentation procedure. The challenge ranks our method's results as the best in MRE of 1.186mm and third in the 2mm SDR of 82.04% on the online validation leaderboard. The code is available at https://github.com/Julian-Wyatt/OptimisingfortheUnknown.




Abstract:Anatomical Landmark Detection is the process of identifying key areas of an image for clinical measurements. Each landmark is a single ground truth point labelled by a clinician. A machine learning model predicts the locus of a landmark as a probability region represented by a heatmap. Diffusion models have increased in popularity for generative modelling due to their high quality sampling and mode coverage, leading to their adoption in medical image processing for semantic segmentation. Diffusion modelling can be further adapted to learn a distribution over landmarks. The stochastic nature of diffusion models captures fluctuations in the landmark prediction, which we leverage by blurring into meaningful probability regions. In this paper, we reformulate automatic Anatomical Landmark Detection as a precise generative modelling task, producing a few-hot pixel heatmap. Our method achieves state-of-the-art MRE and comparable SDR performance with existing work.