Teng Wang

GenHancer: Imperfect Generative Models are Secretly Strong Vision-Centric Enhancers

Mar 25, 2025

Towards Hierarchical Multi-Step Reward Models for Enhanced Reasoning in Large Language Models

Mar 19, 2025

LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos

Nov 29, 2024

BPP-Search: Enhancing Tree of Thought Reasoning for Mathematical Modeling Problem Solving

Nov 26, 2024

ImagineNav: Prompting Vision-Language Models as Embodied Navigator through Scene Imagination

Oct 13, 2024

Sample then Identify: A General Framework for Risk Control and Assessment in Multimodal Large Language Models

Oct 10, 2024

Large Language Models are Good Multi-lingual Learners: When LLMs Meet Cross-lingual Prompts

Sep 17, 2024

Leveraging Large Language Models for Solving Rare MIP Challenges

Sep 03, 2024

Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models

Jul 16, 2024

LVLM-empowered Multi-modal Representation Learning for Visual Place Recognition

Jul 09, 2024