Yifan Yang

Towards Responsible Evaluation for Text-to-Speech

Oct 08, 2025

Diffusion^2: Turning 3D Environments into Radio Frequency Heatmaps

Oct 02, 2025

VidGuard-R1: AI-Generated Video Detection and Explanation via Reasoning MLLMs and RL

Oct 02, 2025

InfiAgent: Self-Evolving Pyramid Agent Framework for Infinite Scenarios

Sep 26, 2025

Memorization in Large Language Models in Medicine: Prevalence, Characteristics, and Implications

Sep 10, 2025

FPC-VLA: A Vision-Language-Action Framework with a Supervisor for Failure Prediction and Correction

Sep 04, 2025

P/D-Device: Disaggregated Large Language Model between Cloud and Devices

Aug 12, 2025

Preview WB-DH: Towards Whole Body Digital Human Bench for the Generation of Whole-body Talking Avatar Videos

Aug 12, 2025

Phi-Ground Tech Report: Advancing Perception in GUI Grounding

Jul 31, 2025

Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Jul 17, 2025