Haodong Duan

VideoRoPE: What Makes for Good Video Rotary Position Embedding?

Feb 07, 2025

Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement

Jan 21, 2025

InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model

Jan 21, 2025

OVO-Bench: How Far is Your Video-LLMs from Real-World Online Video Understanding?

Jan 09, 2025

BoostStep: Boosting mathematical capability of Large Language Models via improved single-step reasoning

Jan 06, 2025

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Dec 12, 2024

MME-Survey: A Comprehensive Survey on Evaluation of Multimodal LLMs

Nov 22, 2024

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models

Oct 23, 2024

CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution

Oct 21, 2024

ProSA: Assessing and Understanding the Prompt Sensitivity of LLMs

Oct 16, 2024