Han Hu

University of Toronto

Understanding vs. Generation: Navigating Optimization Dilemma in Multimodal Models

Feb 17, 2026
Via arXiv

FAIL: Flow Matching Adversarial Imitation Learning for Image Generation

Feb 12, 2026
Via arXiv

GenArena: How Can We Achieve Human-Aligned Evaluation for Visual Generation Tasks?

Feb 05, 2026
Via arXiv

Seeing Is Believing? A Benchmark for Multimodal Large Language Models on Visual Illusions and Anomalies

Feb 02, 2026
Via arXiv

Streaming-dLLM: Accelerating Diffusion LLMs via Suffix Pruning and Dynamic Decoding

Jan 27, 2026
Via arXiv

Omni-directional attention mechanism based on Mamba for speech separation

Jan 23, 2026
Via arXiv

From Text to Simulation: A Multi-Agent LLM Workflow for Automated Chemical Process Design

Jan 11, 2026
Via arXiv

Focal-RegionFace: Generating Fine-Grained Multi-attribute Descriptions for Arbitrarily Selected Face Focal Regions

Jan 01, 2026
Via arXiv

AgentMath: Empowering Mathematical Reasoning for Large Language Models via Tool-Augmented Agent

Dec 23, 2025
Via arXiv

Distribution Matching Variational AutoEncoder

Dec 08, 2025
Via arXiv