Jun Gao

NVIDIA, University of Toronto, Vector Institute

InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models

Dec 05, 2024
Via arXiv

Interleaved-Modal Chain-of-Thought

Nov 29, 2024
Via arXiv

HelloMeme: Integrating Spatial Knitting Attentions to Embed High-Level and Fidelity-Rich Conditions in Diffusion Models

Oct 30, 2024
Via arXiv

BSG4Bot: Efficient Bot Detection based on Biased Heterogeneous Subgraphs

Oct 07, 2024
Via arXiv

SpaceMesh: A Continuous Representation for Learning Manifold Surface Meshes

Sep 30, 2024
Via arXiv

DNTextSpotter: Arbitrary-Shaped Scene Text Spotting via Improved Denoising Training

Aug 01, 2024
Via arXiv

AIM: Let Any Multi-modal Large Language Models Embrace Efficient In-Context Learning

Jun 11, 2024
Via arXiv

Guiding ChatGPT to Generate Salient Domain Summaries

Jun 03, 2024
Via arXiv

Unifying Demonstration Selection and Compression for In-Context Learning

May 27, 2024
Via arXiv

SelfCP: Compressing Long Prompt to 1/12 Using the Frozen Large Language Model Itself

May 27, 2024
Via arXiv