Linjie Yang

COCONut-PanCap: Joint Panoptic Segmentation and Grounded Captions for Fine-Grained Understanding and Generation

Feb 04, 2025

Dual Diffusion for Unified Image Generation and Understanding

Dec 31, 2024

Fast Prompt Alignment for Text-to-Image Generation

Dec 11, 2024

VMAS: Video-to-Music Generation via Semantic Alignment in Web Music Videos

Sep 11, 2024

LSVOS Challenge Report: Large-scale Complex and Long Video Object Segmentation

Sep 09, 2024

Autoregressive Pretraining with Mamba in Vision

Jun 11, 2024

Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters

Mar 05, 2024

Video Recognition in Portrait Mode

Dec 21, 2023

Shot2Story20K: A New Benchmark for Comprehensive Understanding of Multi-shot Videos

Dec 19, 2023

Video-Teller: Enhancing Cross-Modal Generation with Fusion and Decoupling

Oct 11, 2023