Long Lian

Describe Anything: Detailed Localized Image and Video Captioning

Apr 22, 2025

Learning Adaptive Parallel Reasoning with Language Models

Apr 21, 2025

TULIP: Towards Unified Language-Image Pretraining

Mar 19, 2025

Atlas: Multi-Scale Attention Improves Long Context Image Modeling

Mar 16, 2025

Rethinking Patch Dependence for Masked Autoencoders

Jan 25, 2024

Unsupervised Universal Image Segmentation

Dec 28, 2023

Self-correcting LLM-controlled Diffusion Models

Nov 27, 2023

LLM-grounded Video Diffusion Models

Oct 02, 2023

LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image Diffusion Models with Large Language Models

May 23, 2023

Bootstrapping Objectness from Videos by Relaxed Common Fate and Visual Grouping

Apr 17, 2023