Boyuan Jiang

DynamicControl: Adaptive Condition Selection for Improved Text-to-Image Generation

Dec 04, 2024
Via arXiv

Unveil Inversion and Invariance in Flow Transformer for Versatile Image Editing

Nov 26, 2024
Via arXiv

FitDiT: Advancing the Authentic Garment Details for High-fidelity Virtual Try-on

Nov 22, 2024
Via arXiv

VIVID-10M: A Dataset and Baseline for Versatile and Interactive Video Local Editing

Nov 22, 2024
Via arXiv

Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content

Oct 10, 2024
Via arXiv

VTON-HandFit: Virtual Try-on for Arbitrary Hand Pose Guided by Hand Priors Embedding

Aug 27, 2024
Via arXiv

Oracle Bone Inscriptions Multi-modal Dataset

Jul 04, 2024
Via arXiv

NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models

May 31, 2024
Via arXiv

M2-CLIP: A Multimodal, Multi-task Adapting Framework for Video Action Recognition

Jan 22, 2024
Via arXiv

PortraitBooth: A Versatile Portrait Model for Fast Identity-preserved Personalization

Dec 11, 2023
Via arXiv