Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Wenli Yang

Beyond Black-Box AI: Interpretable Hybrid Systems for Dementia Care

Jul 02, 2025

Matthew JY Kang, Wenli Yang, Monica R Roberts, Byeong Ho Kang, Charles B Malpas

Abstract:The recent boom of large language models (LLMs) has re-ignited the hope that artificial intelligence (AI) systems could aid medical diagnosis. Yet despite dazzling benchmark scores, LLM assistants have yet to deliver measurable improvements at the bedside. This scoping review aims to highlight the areas where AI is limited to make practical contributions in the clinical setting, specifically in dementia diagnosis and care. Standalone machine-learning models excel at pattern recognition but seldom provide actionable, interpretable guidance, eroding clinician trust. Adjacent use of LLMs by physicians did not result in better diagnostic accuracy or speed. Key limitations trace to the data-driven paradigm: black-box outputs which lack transparency, vulnerability to hallucinations, and weak causal reasoning. Hybrid approaches that combine statistical learning with expert rule-based knowledge, and involve clinicians throughout the process help bring back interpretability. They also fit better with existing clinical workflows, as seen in examples like PEIRS and ATHENA-CDS. Future decision-support should prioritise explanatory coherence by linking predictions to clinically meaningful causes. This can be done through neuro-symbolic or hybrid AI that combines the language ability of LLMs with human causal expertise. AI researchers have addressed this direction, with explainable AI and neuro-symbolic AI being the next logical steps in further advancement in AI. However, they are still based on data-driven knowledge integration instead of human-in-the-loop approaches. Future research should measure success not only by accuracy but by improvements in clinician understanding, workflow fit, and patient outcomes. A better understanding of what helps improve human-computer interactions is greatly needed for AI systems to become part of clinical practice.

Via

Access Paper or Ask Questions

Seed1.5-VL Technical Report

May 11, 2025

Dong Guo, Faming Wu, Feida Zhu, Fuxing Leng, Guang Shi, Haobin Chen, Haoqi Fan, Jian Wang, Jianyu Jiang, Jiawei Wang(+187 more)

Abstract:We present Seed1.5-VL, a vision-language foundation model designed to advance general-purpose multimodal understanding and reasoning. Seed1.5-VL is composed with a 532M-parameter vision encoder and a Mixture-of-Experts (MoE) LLM of 20B active parameters. Despite its relatively compact architecture, it delivers strong performance across a wide spectrum of public VLM benchmarks and internal evaluation suites, achieving the state-of-the-art performance on 38 out of 60 public benchmarks. Moreover, in agent-centric tasks such as GUI control and gameplay, Seed1.5-VL outperforms leading multimodal systems, including OpenAI CUA and Claude 3.7. Beyond visual and video understanding, it also demonstrates strong reasoning abilities, making it particularly effective for multimodal reasoning challenges such as visual puzzles. We believe these capabilities will empower broader applications across diverse tasks. In this report, we mainly provide a comprehensive review of our experiences in building Seed1.5-VL across model design, data construction, and training at various stages, hoping that this report can inspire further research. Seed1.5-VL is now accessible at https://www.volcengine.com/ (Volcano Engine Model ID: doubao-1-5-thinking-vision-pro-250428)

Via

Access Paper or Ask Questions

Image Segmentation via Variational Model Based Tailored UNet: A Deep Variational Framework

May 09, 2025

Kaili Qi, Wenli Yang, Ye Li, Zhongyi Huang

Abstract:Traditional image segmentation methods, such as variational models based on partial differential equations (PDEs), offer strong mathematical interpretability and precise boundary modeling, but often suffer from sensitivity to parameter settings and high computational costs. In contrast, deep learning models such as UNet, which are relatively lightweight in parameters, excel in automatic feature extraction but lack theoretical interpretability and require extensive labeled data. To harness the complementary strengths of both paradigms, we propose Variational Model Based Tailored UNet (VM_TUNet), a novel hybrid framework that integrates the fourth-order modified Cahn-Hilliard equation with the deep learning backbone of UNet, which combines the interpretability and edge-preserving properties of variational methods with the adaptive feature learning of neural networks. Specifically, a data-driven operator is introduced to replace manual parameter tuning, and we incorporate the tailored finite point method (TFPM) to enforce high-precision boundary preservation. Experimental results on benchmark datasets demonstrate that VM_TUNet achieves superior segmentation performance compared to existing approaches, especially for fine boundary delineation.

Via

Access Paper or Ask Questions

Enhancing tutoring systems by leveraging tailored promptings and domain knowledge with Large Language Models

May 02, 2025

Mohsen Balavar, Wenli Yang, David Herbert, Soonja Yeom

Abstract:Recent advancements in artificial intelligence (AI) and machine learning have reignited interest in their impact on Computer-based Learning (CBL). AI-driven tools like ChatGPT and Intelligent Tutoring Systems (ITS) have enhanced learning experiences through personalisation and flexibility. ITSs can adapt to individual learning needs and provide customised feedback based on a student's performance, cognitive state, and learning path. Despite these advances, challenges remain in accommodating diverse learning styles and delivering real-time, context-aware feedback. Our research aims to address these gaps by integrating skill-aligned feedback via Retrieval Augmented Generation (RAG) into prompt engineering for Large Language Models (LLMs) and developing an application to enhance learning through personalised tutoring in a computer science programming context. The pilot study evaluated a proposed system using three quantitative metrics: readability score, response time, and feedback depth, across three programming tasks of varying complexity. The system successfully sorted simulated students into three skill-level categories and provided context-aware feedback. This targeted approach demonstrated better effectiveness and adaptability compared to general methods.

Via

Access Paper or Ask Questions

Impact and influence of modern AI in metadata management

Jan 28, 2025

Wenli Yang, Rui Fu, Muhammad Bilal Amin, Byeong Kang

Abstract:Metadata management plays a critical role in data governance, resource discovery, and decision-making in the data-driven era. While traditional metadata approaches have primarily focused on organization, classification, and resource reuse, the integration of modern artificial intelligence (AI) technologies has significantly transformed these processes. This paper investigates both traditional and AI-driven metadata approaches by examining open-source solutions, commercial tools, and research initiatives. A comparative analysis of traditional and AI-driven metadata management methods is provided, highlighting existing challenges and their impact on next-generation datasets. The paper also presents an innovative AI-assisted metadata management framework designed to address these challenges. This framework leverages more advanced modern AI technologies to automate metadata generation, enhance governance, and improve the accessibility and usability of modern datasets. Finally, the paper outlines future directions for research and development, proposing opportunities to further advance metadata management in the context of AI-driven innovation and complex datasets.

Via

Access Paper or Ask Questions

Advancing Oyster Phenotype Segmentation with Multi-Network Ensemble and Multi-Scale mechanism

Jan 20, 2025

Wenli Yang, Yanyu Chen, Andrew Trotter, Byeong Kang

Abstract:Phenotype segmentation is pivotal in analysing visual features of living organisms, enhancing our understanding of their characteristics. In the context of oysters, meat quality assessment is paramount, focusing on shell, meat, gonad, and muscle components. Traditional manual inspection methods are time-consuming and subjective, prompting the adoption of machine vision technology for efficient and objective evaluation. We explore machine vision's capacity for segmenting oyster components, leading to the development of a multi-network ensemble approach with a global-local hierarchical attention mechanism. This approach integrates predictions from diverse models and addresses challenges posed by varying scales, ensuring robust instance segmentation across components. Finally, we provide a comprehensive evaluation of the proposed method's performance using different real-world datasets, highlighting its efficacy and robustness in enhancing oyster phenotype segmentation.

Via

Access Paper or Ask Questions

A Concept-based Interpretable Model for the Diagnosis of Choroid Neoplasias using Multimodal Data

Mar 08, 2024

Yifan Wu, Yang Liu, Yue Yang, Michael S. Yao, Wenli Yang, Xuehui Shi, Lihong Yang, Dongjun Li, Yueming Liu, James C. Gee(+3 more)

Abstract:Diagnosing rare diseases presents a common challenge in clinical practice, necessitating the expertise of specialists for accurate identification. The advent of machine learning offers a promising solution, while the development of such technologies is hindered by the scarcity of data on rare conditions and the demand for models that are both interpretable and trustworthy in a clinical context. Interpretable AI, with its capacity for human-readable outputs, can facilitate validation by clinicians and contribute to medical education. In the current work, we focus on choroid neoplasias, the most prevalent form of eye cancer in adults, albeit rare with 5.1 per million. We built the so-far largest dataset consisting of 750 patients, incorporating three distinct imaging modalities collected from 2004 to 2022. Our work introduces a concept-based interpretable model that distinguishes between three types of choroidal tumors, integrating insights from domain experts via radiological reports. Remarkably, this model not only achieves an F1 score of 0.91, rivaling that of black-box models, but also boosts the diagnostic accuracy of junior doctors by 42%. This study highlights the significant potential of interpretable machine learning in improving the diagnosis of rare diseases, laying a groundwork for future breakthroughs in medical AI that could tackle a wider array of complex health scenarios.

Via

Access Paper or Ask Questions

Proceedings of Principle and practice of data and Knowledge Acquisition Workshop 2022

Nov 07, 2022

Qing Liu, Wenli Yang, Shiqing Wu

Abstract:Over the past two decades, PKAW has provided a forum for researchers and practitioners to discuss the state-of-the-arts in the area of knowledge acquisition and machine intelligence (MI, also Artificial Intelligence, AI). PKAW2022 will continue the above focus and welcome the contributions on the multi-disciplinary approach of human and big data-driven knowledge acquisition, as well as AI techniques and applications.

* Proceedings of Principle and practice of data and Knowledge Acquisition Workshop 2022 (PKAW 2022)

Via

Access Paper or Ask Questions

Hybrid CNN -Interpreter: Interpret local and global contexts for CNN-based Models

Oct 31, 2022

Wenli Yang, Guan Huang, Renjie Li, Jiahao Yu, Yanyu Chen, Quan Bai, Beyong Kang

Figure 1 for Hybrid CNN -Interpreter: Interpret local and global contexts for CNN-based Models

Figure 2 for Hybrid CNN -Interpreter: Interpret local and global contexts for CNN-based Models

Figure 3 for Hybrid CNN -Interpreter: Interpret local and global contexts for CNN-based Models

Figure 4 for Hybrid CNN -Interpreter: Interpret local and global contexts for CNN-based Models

Abstract:Convolutional neural network (CNN) models have seen advanced improvements in performance in various domains, but lack of interpretability is a major barrier to assurance and regulation during operation for acceptance and deployment of AI-assisted applications. There have been many works on input interpretability focusing on analyzing the input-output relations, but the internal logic of models has not been clarified in the current mainstream interpretability methods. In this study, we propose a novel hybrid CNN-interpreter through: (1) An original forward propagation mechanism to examine the layer-specific prediction results for local interpretability. (2) A new global interpretability that indicates the feature correlation and filter importance effects. By combining the local and global interpretabilities, hybrid CNN-interpreter enables us to have a solid understanding and monitoring of model context during the whole learning process with detailed and consistent representations. Finally, the proposed interpretabilities have been demonstrated to adapt to various CNN-based model structures.

Via

Access Paper or Ask Questions

A Comprehensive Review on Deep Supervision: Theories and Applications

Jul 06, 2022

Renjie Li, Xinyi Wang, Guan Huang, Wenli Yang, Kaining Zhang, Xiaotong Gu, Son N. Tran, Saurabh Garg, Jane Alty, Quan Bai

Figure 1 for A Comprehensive Review on Deep Supervision: Theories and Applications

Figure 2 for A Comprehensive Review on Deep Supervision: Theories and Applications

Figure 3 for A Comprehensive Review on Deep Supervision: Theories and Applications

Figure 4 for A Comprehensive Review on Deep Supervision: Theories and Applications

Abstract:Deep supervision, or known as 'intermediate supervision' or 'auxiliary supervision', is to add supervision at hidden layers of a neural network. This technique has been increasingly applied in deep neural network learning systems for various computer vision applications recently. There is a consensus that deep supervision helps improve neural network performance by alleviating the gradient vanishing problem, as one of the many strengths of deep supervision. Besides, in different computer vision applications, deep supervision can be applied in different ways. How to make the most use of deep supervision to improve network performance in different applications has not been thoroughly investigated. In this paper, we provide a comprehensive in-depth review of deep supervision in both theories and applications. We propose a new classification of different deep supervision networks, and discuss advantages and limitations of current deep supervision networks in computer vision applications.

Via

Access Paper or Ask Questions