Zheng Lu

MMCR: Benchmarking Cross-Source Reasoning in Scientific Papers

Mar 21, 2025

Yang Tian, Zheng Lu, Mingqi Gao, Zheng Liu, Bo Zhao

Abstract: Full machine comprehension of scientific papers reflects a high level of Artificial General Intelligence. It requires the ability to reason across fragmented and heterogeneous sources of information, a complex and practically significant challenge. While Vision-Language Models (VLMs) have made remarkable strides in various tasks, particularly those that involve reasoning over evidence drawn from a single image or text page, their ability to use cross-source information for reasoning remains an open problem. This work presents MMCR, a high-difficulty benchmark designed to evaluate VLMs' capacity for reasoning with cross-source information from scientific papers. The benchmark comprises 276 high-quality questions, meticulously annotated by humans across 7 subjects and 10 task types. Experiments with 18 VLMs demonstrate that cross-source reasoning presents a substantial challenge for existing models. Notably, even the top-performing model, GPT-4o, achieved only 48.55% overall accuracy, with only 20% accuracy in multi-table comprehension tasks, while the second-best model, Qwen2.5-VL-72B, reached 39.86% overall accuracy. Furthermore, we investigated the impact of the Chain-of-Thought (CoT) technique on cross-source reasoning and observed a detrimental effect on small models, whereas larger models demonstrated substantially enhanced performance. These results highlight the pressing need to develop VLMs capable of effectively utilizing cross-source information for reasoning.

MedKAN: An Advanced Kolmogorov-Arnold Network for Medical Image Classification

Feb 25, 2025

Zhuoqin Yang, Jiansong Zhang, Xiaoling Luo, Zheng Lu, Linlin Shen

Abstract: Recent advancements in deep learning for image classification predominantly rely on convolutional neural networks (CNNs) or Transformer-based architectures. However, these models face notable challenges in medical imaging, particularly in capturing intricate texture details and contextual features. Kolmogorov-Arnold Networks (KANs) represent a novel class of architectures that enhance nonlinear transformation modeling, offering improved representation of complex features. In this work, we present MedKAN, a medical image classification framework built upon KAN and its convolutional extensions. MedKAN features two core modules: the Local Information KAN (LIK) module for fine-grained feature extraction and the Global Information KAN (GIK) module for global context integration. By combining these modules, MedKAN achieves robust feature modeling and fusion. To address diverse computational needs, we introduce three scalable variants: MedKAN-S, MedKAN-B, and MedKAN-L. Experimental results on nine public medical imaging datasets demonstrate that MedKAN achieves superior performance compared to CNN- and Transformer-based models, highlighting its effectiveness and generalizability in medical image analysis.

Activation Space Selectable Kolmogorov-Arnold Networks

Aug 15, 2024

Zhuoqin Yang, Jiansong Zhang, Xiaoling Luo, Zheng Lu, Linlin Shen

Abstract: The multilayer perceptron (MLP), a fundamental paradigm in current artificial intelligence, is widely applied in fields such as computer vision and natural language processing. However, the recently proposed Kolmogorov-Arnold Network (KAN), based on nonlinear additive connections, has been proven to achieve performance comparable to MLPs with significantly fewer parameters. Despite this potential, the use of a single activation function space results in reduced performance of KAN and related works across different tasks. To address this issue, we propose an activation space Selectable KAN (S-KAN). S-KAN employs an adaptive strategy to choose the possible activation mode for data at each feedforward KAN node. Our approach outperforms baseline methods in seven representative function fitting tasks and significantly surpasses MLP methods with the same level of parameters. Furthermore, we extend the structure of S-KAN and propose an activation space selectable Convolutional KAN (S-ConvKAN), which achieves leading results on four general image classification datasets. Our method mitigates the performance variability of the original KAN across different tasks and demonstrates through extensive experiments that feedforward KANs with selectable activations can achieve or even exceed the performance of MLP-based methods. This work contributes to the understanding of the data-centric design of new AI paradigms and provides a foundational reference for innovations in KAN-based network architectures.

* 12 pages, 6 figures. The code for this work will be released soon

Analytics and Machine Learning in Vehicle Routing Research

Feb 19, 2021

Ruibin Bai, Xinan Chen, Zhi-Long Chen, Tianxiang Cui, Shuhui Gong, Wentao He, Xiaoping Jiang, Huan Jin, Jiahuan Jin, Graham Kendall (+6 more)

Abstract: The Vehicle Routing Problem (VRP) is one of the most intensively studied combinatorial optimisation problems, for which numerous models and algorithms have been proposed. To tackle the complexities, uncertainties and dynamics involved in real-world VRP applications, Machine Learning (ML) methods have been used in combination with analytical approaches to enhance problem formulations and algorithmic performance across different problem solving scenarios. However, the relevant papers are scattered across several traditional research fields with very different, sometimes confusing, terminologies. This paper presents a first, comprehensive review of hybrid methods that combine analytical techniques with ML tools in addressing VRPs. Specifically, we review the emerging research streams on ML-assisted VRP modelling and ML-assisted VRP optimisation. We conclude that ML can be beneficial in enhancing VRP modelling and in improving the performance of algorithms for both online and offline VRP optimisation. Finally, challenges and future opportunities of VRP research are discussed.

* Submitted to International Journal of Production Research

Learning Regular Expressions for Interpretable Medical Text Classification Using a Pool-based Simulated Annealing and Word-vector Models

Nov 16, 2020

Chaofan Tu, Ruibin Bai, Zheng Lu, Uwe Aickelin, Peiming Ge, Jianshuang Zhao

Abstract: In this paper, we propose a rule-based engine composed of high-quality, interpretable regular expressions for medical text classification. The regular expressions are automatically generated by a constructive heuristic method and optimized using a Pool-based Simulated Annealing (PSA) approach. Although existing Deep Neural Network (DNN) methods achieve high-quality performance in most Natural Language Processing (NLP) applications, their solutions are regarded as uninterpretable black boxes by humans. Therefore, rule-based methods are often introduced when interpretable solutions are needed, especially in the medical field. However, the construction of regular expressions can be extremely labor-intensive for large data sets. This research aims to reduce the manual effort while maintaining high-quality solutions.

* 9th Multidisciplinary International Conference on Scheduling: Theory and Applications (MISTA 2019), 12-15 December 2019, Ningbo, China

Retrieving and ranking short medical questions with two stages neural matching model

Nov 16, 2020

Xiang Li, Xinyu Fu, Zheng Lu, Ruibin Bai, Uwe Aickelin, Peiming Ge, Gong Liu

Abstract: Internet hospitals are a rising business thanks to recent advances in mobile web technology and high demand for health care services. Online medical services are becoming increasingly popular and active. According to US data from 2018, 80 percent of internet users have asked health-related questions online. Large volumes of data are being generated at unprecedented speed and scale. Representative questions and answers in medical fields are valuable raw data sources for medical data mining. Automated machine interpretation of this sheer amount of data offers an opportunity to help doctors answer frequently asked medical questions using information retrieval and machine learning approaches. In this work, we propose a novel two-stage framework for the semantic matching of query-level medical questions.

* 2019 IEEE Congress on Evolutionary Computation (CEC), pages 873-879

Augmenting C. elegans Microscopic Dataset for Accelerated Pattern Recognition

May 31, 2019

Dali Wang, Zheng Lu, Zhirong Bao

Abstract: The detection of cell shape changes in 3D time-lapse images of complex tissues is an important task. However, establishing a comprehensive dataset to improve the performance of deep learning models is challenging and tedious. In this paper, we present a deep learning approach to augment 3D live images of the Caenorhabditis elegans embryo in order to speed up the recognition of specific structural patterns. We use unsupervised training over unlabeled images to generate supplementary datasets for further pattern recognition. Technically, we use Alex-style neural networks in a generative adversarial network framework to generate new datasets that share common features of the C. elegans membrane structure. We have also made the dataset available to the broader scientific community.
