Chongjian Ge

CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians

Oct 28, 2024
Via arXiv

WOMD-Reasoning: A Large-Scale Language Dataset for Interaction and Driving Intentions Reasoning

Jul 05, 2024
Via arXiv

PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

Mar 07, 2024
Via arXiv

RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis

Feb 25, 2024
Via arXiv

Advancing Vision Transformers with Group-Mix Attention

Nov 26, 2023
Via arXiv

Large Language Models as Automated Aligners for benchmarking Vision-Language Models

Nov 24, 2023
Via arXiv

InstructDET: Diversifying Referring Object Detection with Generalized Instructions

Oct 17, 2023
Via arXiv

PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

Oct 16, 2023
Via arXiv

Speed Co-Augmentation for Unsupervised Audio-Visual Pre-training

Sep 25, 2023
Via arXiv

MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation

Apr 19, 2023
Via arXiv