Yaqi Zhao

Towards Precise Scaling Laws for Video Diffusion Transformers

Nov 25, 2024
Via arXiv

Beyond Sight: Towards Cognitive Alignment in LVLM via Enriched Visual Knowledge

Nov 25, 2024
Via arXiv

Baichuan-Omni Technical Report

Oct 11, 2024
Via arXiv

SEA: Supervised Embedding Alignment for Token-Level Visual-Textual Integration in MLLMs

Aug 21, 2024
Via arXiv