Abstract: Data in the healthcare domain arise from a variety of sources and modalities, such as X-ray images, continuous measurements, and clinical notes. Medical practitioners integrate these diverse data types daily to make informed and accurate decisions. Given recent advances in language models capable of handling multimodal data, applying these models to healthcare is a logical next step. In this work, we introduce a framework that connects small language models to multiple data sources in order to predict the risk of several diseases simultaneously. Our experiments cover 12 different tasks in a multitask learning setup. Although our approach does not surpass state-of-the-art methods specialized for single tasks, it achieves competitive performance and underscores the potential of small language models for multimodal reasoning in healthcare.
Abstract: The lack of sufficiently large open medical databases is one of the biggest challenges in AI-powered healthcare. Synthetic data created using Generative Adversarial Networks (GANs) appear to be a good way to mitigate the restrictions imposed by privacy policies. Another remedy is a decentralized protocol in which multiple medical institutions train jointly without exchanging local data samples. In this paper, we explored unconditional and conditional GANs in centralized and decentralized settings. The centralized setting imitates studies on a large but highly imbalanced skin lesion dataset, while the decentralized one simulates a more realistic hospital scenario with three institutions. We evaluated the models in terms of fidelity, diversity, training speed, and the predictive ability of classifiers trained on the generated synthetic data. In addition, we provided explainability through exploration of the latent space and projection of embeddings, covering both global and local explanations. The calculated distances between real images and their projections in the latent space demonstrated the authenticity and generalization of the trained GANs, which is one of the main concerns in this type of application. The open-source code for these studies is publicly available at \url{https://github.com/aidotse/stylegan2-ada-pytorch}.