Yihe Deng

OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement

Mar 21, 2025
Via arXiv

DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails

Feb 07, 2025
Via arXiv

Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning

Oct 29, 2024
Via arXiv

MIRAI: Evaluating LLM Agents for Event Forecasting

Jul 01, 2024
Via arXiv

Enhancing Large Vision Language Models with Self-Training on Image Comprehension

May 30, 2024
Via arXiv

Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance

Feb 13, 2024
Via arXiv

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Jan 02, 2024
Via arXiv

Risk Bounds of Accelerated SGD for Overparameterized Linear Regression

Nov 23, 2023
Via arXiv

Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves

Nov 07, 2023
Via arXiv

Understanding Transferable Representation Learning and Zero-shot Transfer in CLIP

Oct 02, 2023
Via arXiv