Jiayuan Chen

Sherman

Not Search, But Scan: Benchmarking MLLMs on Scan-Oriented Academic Paper Reasoning

Mar 27, 2026

THEMIS: Towards Holistic Evaluation of MLLMs for Scientific Paper Fraud Forensics

Mar 26, 2026

A Scheduling Framework for Efficient MoE Inference on Edge GPU-NDP Systems

Jan 07, 2026

MedBench v4: A Robust and Scalable Benchmark for Evaluating Chinese Medical Language Models, Multimodal Models, and Intelligent Agents

Nov 19, 2025

TCM-5CEval: Extended Deep Evaluation Benchmark for LLM's Comprehensive Clinical Research Competence in Traditional Chinese Medicine

Nov 17, 2025

MedCalc-Eval and MedCalc-Env: Advancing Medical Calculation Capabilities of Large Language Models

Oct 31, 2025

Building a Human-Verified Clinical Reasoning Dataset via a Human LLM Hybrid Pipeline for Trustworthy Medical AI

May 11, 2025

TCM-3CEval: A Triaxial Benchmark for Assessing Responses from Large Language Models in Traditional Chinese Medicine

Mar 10, 2025

Benchmarking Chinese Medical LLMs: A Medbench-based Analysis of Performance Gaps and Hierarchical Optimization Strategies

Mar 10, 2025

Predictive Modeling with Temporal Graphical Representation on Electronic Health Records

May 07, 2024