Yizhi Wang

To What Extent Do Token-Level Representations from Pathology Foundation Models Improve Dense Prediction?

Feb 03, 2026
Via arXiv

Envision: Embodied Visual Planning via Goal-Imagery Video Diffusion

Dec 27, 2025
Via arXiv

Training Multimodal Large Reasoning Models Needs Better Thoughts: A Three-Stage Framework for Long Chain-of-Thought Synthesis and Selection

Dec 22, 2025
Via arXiv

VIVA: VLM-Guided Instruction-Based Video Editing with Reward Optimization

Dec 18, 2025
Via arXiv

StainNet: A Special Staining Self-Supervised Vision Transformer for Computational Pathology

Dec 11, 2025
Via arXiv

MAGREF: Masked Guidance for Any-Reference Video Generation

May 29, 2025
Via arXiv

Subspecialty-Specific Foundation Model for Intelligent Gastrointestinal Pathology

May 28, 2025
Via arXiv

ACT-R: Adaptive Camera Trajectories for 3D Reconstruction from Single Image

May 13, 2025
Via arXiv

PathOrchestra: A Comprehensive Foundation Model for Computational Pathology with Over 100 Diverse Clinical-Grade Tasks

Mar 31, 2025
Via arXiv

CINEMA: Coherent Multi-Subject Video Generation via MLLM-Based Guidance

Mar 13, 2025
Via arXiv