Zhenguo Li

Dual Risk Minimization: Towards Next-Level Robustness in Fine-tuning Zero-Shot Models

Nov 29, 2024
Via arXiv

Efficient Multi-modal Large Language Models via Visual Token Grouping

Nov 26, 2024
Via arXiv

MagicDriveDiT: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control

Nov 21, 2024
Via arXiv

Forewarned is Forearmed: Leveraging LLMs for Data Synthesis through Failure-Inducing Exploration

Oct 22, 2024
Via arXiv

Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning

Oct 18, 2024
Via arXiv

How Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs

Oct 17, 2024
Via arXiv

DAPE V2: Process Attention Score as Feature Map for Length Extrapolation

Oct 07, 2024
Via arXiv

Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding

Oct 02, 2024
Via arXiv

EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions

Sep 26, 2024
Via arXiv

CoCA: Regaining Safety-awareness of Multimodal Large Language Models with Constitutional Calibration

Sep 17, 2024
Via arXiv