Jiaqi Wang

Michael Pokorny

MM-IFEngine: Towards Multimodal Instruction Following

Apr 10, 2025
Via arXiv

HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance

Apr 08, 2025
Via arXiv

Multi-label classification for multi-temporal, multi-spatial coral reef condition monitoring using vision foundation model with adapter learning

Mar 29, 2025
Via arXiv

DualToken: Towards Unifying Visual Understanding and Generation with Dual Visual Vocabularies

Mar 19, 2025
Via arXiv

Unified Reward Model for Multimodal Understanding and Generation

Mar 07, 2025
Via arXiv

Visual-RFT: Visual Reinforcement Fine-Tuning

Mar 03, 2025
Via arXiv

OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference

Feb 25, 2025
Via arXiv

DivIL: Unveiling and Addressing Over-Invariance for Out-of-Distribution Generalization

Feb 18, 2025
Via arXiv

SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation

Feb 18, 2025
Via arXiv

Oversmoothing as Loss of Sign: Towards Structural Balance in Graph Neural Networks

Feb 17, 2025
Via arXiv