Wenwei Zhang

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

Feb 10, 2025

InternLM-XComposer2.5-Reward: A Simple Yet Effective Multi-Modal Reward Model

Jan 21, 2025

Are VLMs Ready for Autonomous Driving? An Empirical Study from the Reliability, Data, and Metric Perspectives

Jan 07, 2025

LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving

Jan 07, 2025

Are Your LLMs Capable of Stable Reasoning?

Dec 17, 2024

InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions

Dec 12, 2024

Training Language Models to Critique With Multi-agent Feedback

Oct 20, 2024

LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D-awareness

Sep 26, 2024

SLAM assisted 3D tracking system for laparoscopic surgery

Sep 18, 2024

MindSearch: Mimicking Human Minds Elicits Deep AI Searcher

Jul 29, 2024