Abstract: Foundation models for echocardiography promise to reduce annotation burden and improve diagnostic consistency by learning generalizable representations from large unlabeled video archives. However, current approaches fail to disentangle anatomical signal from the stochastic speckle and acquisition artifacts that dominate ultrasound imagery. We present EchoJEPA, a foundation model for echocardiography trained on 18 million echocardiograms across 300K patients, the largest pretraining corpus for this modality to date. We also introduce a novel multi-view probing framework with factorized stream embeddings that standardizes evaluation under frozen backbones. Compared to prior methods, EchoJEPA reduces left ventricular ejection fraction estimation error by 19% and achieves 87.4% view classification accuracy. EchoJEPA exhibits strong sample efficiency, reaching 78.6% accuracy with only 1% of labeled data versus 42.1% for the best baseline trained on 100%. Under acoustic perturbations, EchoJEPA degrades by only 2.3% compared to 16.8% for the next best model, and transfers zero-shot to pediatric patients with 15% lower error than the next best model, outperforming all fine-tuned baselines. These results establish latent prediction as a superior paradigm for ultrasound foundation models.




Abstract: Prostate cancer is one of the most common causes of cancer deaths in men. There is a growing demand for noninvasive and accurate diagnostic methods that facilitate the current standard prostate cancer risk assessment in clinical practice. Still, developing computer-aided classification tools for prostate cancer diagnostics from multiparametric magnetic resonance images remains a challenge. In this work, we propose a novel deep learning approach for automatic classification of prostate lesions in magnetic resonance images, built on a two-stage multimodal multi-stream convolutional neural network (CNN) architecture. Without sophisticated image preprocessing steps or third-party software, our framework achieved an area under the receiver operating characteristic (ROC) curve of 0.87. This result outperformed most of the submitted methods and matched the highest value reported by the PROSTATEx Challenge organizer. Our proposed CNN-based framework shows potential for assisting medical image interpretation in prostate cancer and reducing unnecessary biopsies.
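
The area under the ROC curve reported above can be illustrated with a small worked computation. The sketch below is not the authors' evaluation code; it computes the metric as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are assumptions made for illustration.

```python
import numpy as np

def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (hypothetical helper)."""
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos = scores[labels]    # scores of positive cases
    neg = scores[~labels]   # scores of negative cases
    # Count positive/negative pairs ranked correctly; ties count as one half.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)

# Toy data (assumed): two negatives, two positives.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

Here three of the four positive/negative pairs are ordered correctly, which gives 0.75. The pairwise form is quadratic in the number of cases, so it suits small test sets rather than large ones.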




Abstract: Beamforming in ultrasound imaging has a significant impact on the quality of the final image, controlling its resolution and contrast. Despite its low spatial resolution and contrast, delay-and-sum is still widely used in clinical applications because of its real-time capabilities. The most common alternatives are the minimum variance method and its variants, which overcome the drawbacks of delay-and-sum at the cost of a higher computational complexity that limits their use in real-time applications. In this paper, we propose to perform beamforming in ultrasound imaging by solving a regularized inverse problem based on a linear model relating the reflected echoes to the signal to be recovered. Our approach presents two major advantages: i) flexibility in the choice of statistical assumptions on the signal to be beamformed (Laplacian and Gaussian statistics are tested herein) and ii) robustness to a reduced number of pulse emissions. The proposed framework allows for choosing the right trade-off between noise suppression and sharpness of the resulting image. We illustrate the performance of our approach on both simulated and experimental data, with in vivo examples of carotid and thyroid. Compared to delay-and-sum, minimum variance, and two other recently published beamforming techniques, our method offers better spatial resolution with the Laplacian prior and better contrast with the Gaussian prior.
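
The regularized inverse problem described above can be sketched generically for a linear model y = A x + n. The code below is a minimal illustration, not the authors' implementation: the abstract does not specify the solver or the model matrix, so the random real-valued matrix `A`, the function names, the regularization weights, and the choice of ISTA for the l1 case are all assumptions. A Gaussian prior on x leads to an l2 (Tikhonov) penalty with a closed-form solution, while a Laplacian prior leads to an l1 penalty, solved here by iterative soft thresholding.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_objective(A, y, x, lam):
    """0.5 * ||y - A x||^2 + lam * ||x||_1."""
    return 0.5 * np.sum((y - A @ x) ** 2) + lam * np.sum(np.abs(x))

def solve_gaussian_prior(A, y, lam):
    """Minimize 0.5*||y - A x||^2 + 0.5*lam*||x||^2 via the normal equations."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(n), A.T @ y)

def solve_laplacian_prior(A, y, lam, n_iter=500):
    """Minimize 0.5*||y - A x||^2 + lam*||x||_1 with ISTA (assumed solver)."""
    L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the data-term gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)       # gradient of 0.5*||y - A x||^2
        x = soft_threshold(x - grad / L, lam / L)
    return x

# Toy stand-in for the echo model (assumed): sparse reflectors, noisy measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((64, 128))
x_true = np.zeros(128)
x_true[[10, 50, 90]] = [1.0, -0.8, 0.6]
y = A @ x_true + 0.01 * rng.standard_normal(64)

x_gauss = solve_gaussian_prior(A, y, lam=0.1)
x_lap = solve_laplacian_prior(A, y, lam=0.05)
```

In this toy setting the weight `lam` plays the role of the trade-off mentioned in the abstract: larger values suppress more noise at the expense of fidelity to the measured echoes. Real ultrasound data would call for a physically derived model matrix in place of the random `A` used here.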