Abstract: Medical Visual Question Answering (VQA) enhances clinical decision-making by enabling systems to interpret medical images and answer clinical queries. However, developing efficient, high-performance VQA models is challenging due to the complexity of medical imagery and the diversity of imaging modalities. In this paper, we introduce a lightweight, multimodal VQA model integrating BiomedCLIP for image feature extraction and LLaMA-3 for text processing. Designed for medical VQA tasks, our model achieves state-of-the-art performance on the OmniMedVQA dataset. With approximately 8 billion parameters, it requires only two NVIDIA A100 40 GB GPUs, demonstrating superior efficiency over larger models. Our results show 73.4% accuracy on open-ended questions, surpassing existing models and validating the model's potential for real-world medical applications. Key contributions include a specialized multimodal VQA model, a resource-efficient architecture, and strong performance in answering open-ended clinical questions.
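The abstract summarizes rather than details the architecture. As a rough illustration of the fusion it describes (BiomedCLIP image features fed into a LLaMA-3 language model), the following is a minimal sketch: a linear projector maps an image embedding into the LLM's embedding space and prepends it as a visual token. The dimensions, the single-token prefix design, and the stand-in encoders are illustrative assumptions, not the authors' released code.

```python
import torch
import torch.nn as nn

class VisualPrefixVQA(nn.Module):
    """Fuse a frozen image encoder with a causal LM via a linear projector.

    A minimal sketch of the architecture described in the abstract;
    the fusion strategy and dimensions are assumptions.
    """
    def __init__(self, image_encoder, clip_dim=512, llm_dim=4096):
        super().__init__()
        self.image_encoder = image_encoder  # e.g. a BiomedCLIP vision tower
        # Project image features into the language model's embedding space.
        self.projector = nn.Linear(clip_dim, llm_dim)

    def forward(self, pixel_values, text_embeds):
        # (B, clip_dim) image features -> one (B, 1, llm_dim) visual token.
        img_feat = self.image_encoder(pixel_values)
        visual_token = self.projector(img_feat).unsqueeze(1)
        # Prepend the visual token to the question-token embeddings; the
        # combined sequence would then be fed to the language model.
        return torch.cat([visual_token, text_embeds], dim=1)

# Smoke test with stand-in tensors (no real BiomedCLIP / LLaMA-3 weights).
if __name__ == "__main__":
    dummy_encoder = nn.Sequential(nn.Flatten(), nn.Linear(3 * 224 * 224, 512))
    model = VisualPrefixVQA(dummy_encoder)
    images = torch.randn(2, 3, 224, 224)
    text = torch.randn(2, 16, 4096)      # 16 question-token embeddings
    print(model(images, text).shape)     # torch.Size([2, 17, 4096])
```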
Abstract: Industry 5.0 envisions humans and robots working together in a shared environment to enhance human performance. Although significant progress has been made in outdoor positioning, indoor positioning remains a challenge. In this paper, we introduce a new research concept that exploits the potential of indoor positioning for Industry 5.0. We use the Wi-Fi Received Signal Strength Indicator (RSSI) with trilateration, implemented on inexpensive and readily available ESP32 Arduino boards, both to position and to send effective route signals in real time to a human and a robot working in a simulated indoor factory environment. We utilize machine learning models to detect safe proximity between two co-workers (a human subject and a robot). Experimental data and analysis show an average deviation of less than 1 m from the actual distance, whether the targets are mobile or stationary.
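For readers unfamiliar with the technique the abstract names, the following is a minimal sketch of RSSI-based ranging plus trilateration: RSSI is converted to a distance with a log-distance path-loss model, and the position is recovered from three anchors by linearizing the circle equations into a least-squares problem. The calibration constants, path-loss exponent, and anchor layout are illustrative assumptions, not the paper's measured parameters.

```python
import numpy as np

def rssi_to_distance(rssi_dbm, tx_power_dbm=-59.0, n=2.0):
    """Log-distance path-loss model; tx_power_dbm is the expected RSSI at 1 m.

    The calibration constants are typical illustrative values.
    """
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10.0 * n))

def trilaterate(anchors, distances):
    """Least-squares (x, y) fix from >= 3 anchor positions and ranges."""
    anchors = np.asarray(anchors, dtype=float)
    d = np.asarray(distances, dtype=float)
    x1, y1 = anchors[0]
    # Subtract the first circle equation from the rest to get linear rows:
    # 2(xi - x1) x + 2(yi - y1) y = d1^2 - di^2 + xi^2 - x1^2 + yi^2 - y1^2
    A = 2.0 * (anchors[1:] - anchors[0])
    b = (d[0] ** 2 - d[1:] ** 2
         + np.sum(anchors[1:] ** 2, axis=1)
         - (x1 ** 2 + y1 ** 2))
    pos, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pos

# Example: three ESP32 anchors at known positions, RSSI readings in dBm.
anchors = [(0.0, 0.0), (5.0, 0.0), (0.0, 5.0)]
rssi = [-65.0, -72.0, -70.0]
ranges = [rssi_to_distance(r) for r in rssi]
print(trilaterate(anchors, ranges))  # estimated (x, y) in metres
```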