Yi Zhu

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Jan 08, 2025

Progressive Document-level Text Simplification via Large Language Models

Jan 07, 2025

Benchmarking Table Comprehension In The Wild

Dec 13, 2024

SparseGrasp: Robotic Grasping via 3D Semantic Gaussian Splatting from Sparse Multi-View RGB Images

Dec 03, 2024

VidMan: Exploiting Implicit Dynamics from Video Diffusion Model for Effective Robot Manipulation

Nov 14, 2024

Learn from Real: Reality Defender's Submission to ASVspoof5 Challenge

Oct 09, 2024

Differential Transformer

Oct 07, 2024

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

Sep 26, 2024

Cross-Organ Domain Adaptive Neural Network for Pancreatic Endoscopic Ultrasound Image Segmentation

Sep 07, 2024

UNIT: Unifying Image and Text Recognition in One Vision Encoder

Sep 06, 2024