Jin Gao

HSTrack: Bootstrap End-to-End Multi-Camera 3D Multi-object Tracking with Hybrid Supervision

Nov 11, 2024

VQ-Map: Bird's-Eye-View Map Layout Estimation in Tokenized Discrete Space via Vector Quantization

Nov 03, 2024

Dissecting Dissonance: Benchmarking Large Multimodal Models Against Self-Contradictory Instructions

Aug 05, 2024

Temporal Correlation Meets Embedding: Towards a 2nd Generation of JDE-based Real-Time Multi-Object Tracking

Jul 19, 2024

Animate3D: Animating Any 3D Model with Multi-view Video Diffusion

Jul 16, 2024

Graspness Discovery in Clutters for Fast and Accurate Grasp Detection

Jun 17, 2024

Observation, Analysis, and Solution: Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training

Apr 18, 2024

BEV2PR: BEV-Enhanced Visual Place Recognition with Structural Cues

Mar 11, 2024

Multi-Generative Agent Collective Decision-Making in Urban Planning: A Case Study for Kendall Square Renovation

Feb 17, 2024

Data-Centric Foundation Models in Computational Healthcare: A Survey

Jan 04, 2024