Yuxiao Chen

DreamDrive: Generative 4D Scene Modeling from Street View Images

Jan 03, 2025

STORM: Spatio-Temporal Reconstruction Model for Large-Scale Outdoor Scenes

Dec 31, 2024

Closed-Loop Supervised Fine-Tuning of Tokenized Traffic Models

Dec 05, 2024

Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection

Nov 17, 2024

Optimal Defenses Against Gradient Reconstruction Attacks

Nov 06, 2024

Gen-Drive: Enhancing Diffusion Generative Driving Policies with Reward Modeling and Reinforcement Learning Fine-tuning

Oct 08, 2024

Learning to Localize Actions in Instructional Videos with LLM-Based Multi-Pathway Text-Video Alignment

Sep 22, 2024

Promptable Closed-loop Traffic Simulation

Sep 09, 2024

Wolf: Captioning Everything with a World Summarization Framework

Jul 26, 2024

Tokenize the World into Object-level Knowledge to Address Long-tail Events in Autonomous Driving

Jul 01, 2024