Kaixiong Gong

AV-Odyssey Bench: Can Your Multimodal LLMs Really Understand Audio-Visual Information?

Dec 03, 2024
Via arXiv

BIFRÖST: 3D-Aware Image compositing with Language Instructions

Oct 24, 2024
Via arXiv

Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities

Jan 25, 2024
Via arXiv

Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D priors

Dec 07, 2023
Via arXiv

OneLLM: One Framework to Align All Modalities with Language

Dec 06, 2023
Via arXiv

Towards Unified and Effective Domain Generalization

Oct 16, 2023
Via arXiv

Meta-Transformer: A Unified Framework for Multimodal Learning

Jul 20, 2023
Via arXiv

Improving Transferability for Domain Adaptive Detection Transformers

Apr 29, 2022
Via arXiv

Pareto Domain Adaptation

Dec 09, 2021
Via arXiv

MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition

Apr 07, 2021
Via arXiv