Huan Wang

Stephen

FreeBlend: Advancing Concept Blending with Staged Feedback-Driven Interpolation Diffusion

Feb 08, 2025
Via arXiv

Poison as Cure: Visual Noise for Mitigating Object Hallucinations in LVMs

Jan 31, 2025
Via arXiv

Dynamic Token Reduction during Generation for Vision Language Models

Jan 24, 2025
Via arXiv

TACO: Learning Multi-modal Action Models with Synthetic Chains-of-Thought-and-Action

Dec 10, 2024
Via arXiv

Slicing Vision Transformer for Flexible Inference

Dec 06, 2024
Via arXiv

Is Oracle Pruning the True Oracle?

Nov 28, 2024
Via arXiv

Individual Content and Motion Dynamics Preserved Pruning for Video Diffusion Models

Nov 27, 2024
Via arXiv

DyCoke: Dynamic Compression of Tokens for Fast Video Large Language Models

Nov 22, 2024
Via arXiv

SpecTool: A Benchmark for Characterizing Errors in Tool-Use LLMs

Nov 20, 2024
Via arXiv

Language Models are Hidden Reasoners: Unlocking Latent Reasoning Capabilities via Self-Rewarding

Nov 06, 2024
Via arXiv