Yiming Yang

CEMSE Division, King Abdullah University of Science and Technology

Improve Vision Language Model Chain-of-thought Reasoning

Oct 21, 2024
Via arXiv

Step-by-Step Reasoning for Math Problems via Twisted Sequential Monte Carlo

Oct 02, 2024
Via arXiv

An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models

Aug 01, 2024
Via arXiv

Lean-STaR: Learning to Interleave Thinking and Proving

Jul 14, 2024
Via arXiv

Few-shot Personalization of LLMs with Mis-aligned Responses

Jun 26, 2024
Via arXiv

Learning to Correct for QA Reasoning with Black-box LLMs

Jun 26, 2024
Via arXiv

Self-Play Preference Optimization for Language Model Alignment

May 01, 2024
Via arXiv

A Fully Screen-Printed Vanadium-Dioxide Switches Based Wideband Reconfigurable Intelligent Surface for 5G Bands

Apr 30, 2024
Via arXiv

Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward

Apr 02, 2024
Via arXiv

Easy-to-Hard Generalization: Scalable Alignment Beyond Human Supervision

Mar 14, 2024
Via arXiv