Shufan Li

MedMax: Mixed-Modal Instruction Tuning for Training Biomedical Assistants

Dec 17, 2024

OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows

Dec 02, 2024

Unlocking the Potential of Text-to-Image Diffusion with PAC-Bayesian Theory

Nov 25, 2024

SegLLM: Multi-round Reasoning Segmentation

Oct 24, 2024

PopAlign: Population-Level Alignment for Fair Text-to-Image Generation

Jun 28, 2024

Aligning Diffusion Models by Optimizing Human Utility

Apr 06, 2024

xT: Nested Tokenization for Larger Context in Large Images

Mar 04, 2024

Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data

Feb 08, 2024

InstructAny2Pix: Flexible Visual Editing via Multimodal Instruction Following

Dec 30, 2023

Hierarchical Open-vocabulary Universal Image Segmentation

Jul 03, 2023