Title: Do Multimodal Large Language Models Understand Welding?

Mar 18, 2025

Grigorii Khvatskii, Yong Suk Lee, Corey Angst, Maria Gibbs, Robert Landers, Nitesh V. Chawla

Abstract: This paper examines the performance of Multimodal LLMs (MLLMs) in skilled production work, with a focus on welding. Using a novel data set of real-world and online weld images, annotated by a domain expert, we evaluate the performance of two state-of-the-art MLLMs in assessing weld acceptability across three contexts: RV & Marine, Aeronautical, and Farming. While both models perform better on online images, likely due to prior exposure or memorization, they also perform relatively well on unseen, real-world weld images. Additionally, we introduce WeldPrompt, a prompting strategy that combines Chain-of-Thought generation with in-context learning to mitigate hallucinations and improve reasoning. WeldPrompt improves model recall in certain contexts but exhibits inconsistent performance across others. These results underscore the limitations and potential of MLLMs in high-stakes technical domains and highlight the importance of fine-tuning, domain-specific data, and more sophisticated prompting strategies to improve model reliability. The study opens avenues for further research into multimodal learning in industry applications.

* 16 pages
