Zheng Zhao

What Is That Talk About? A Video-to-Text Summarization Dataset for Scientific Presentations

Feb 12, 2025
via arXiv

Solving Linear-Gaussian Bayesian Inverse Problems with Decoupled Diffusion Sequential Monte Carlo

Feb 10, 2025
via arXiv

HiMix: Reducing Computational Complexity in Large Vision-Language Models

Jan 17, 2025
via arXiv

Manga Generation via Layout-controllable Diffusion

Dec 26, 2024
via arXiv

InstructSeg: Unifying Instructed Visual Segmentation with Multi-modal Large Language Models

Dec 18, 2024
via arXiv

LinVT: Empower Your Image-level Large Language Model to Understand Videos

Dec 06, 2024
via arXiv

TASR: Timestep-Aware Diffusion Model for Image Super-Resolution

Dec 04, 2024
via arXiv

RFSR: Improving ISR Diffusion Models via Reward Feedback Learning

Dec 04, 2024
via arXiv

HyperSeg: Towards Universal Visual Segmentation with Large Language Model

Nov 26, 2024
via arXiv

Layer by Layer: Uncovering Where Multi-Task Learning Happens in Instruction-Tuned Large Language Models

Oct 25, 2024
via arXiv