Yue Wang

BEV-ODOM: Reducing Scale Drift in Monocular Visual Odometry with BEV Representation

Nov 15, 2024
Via arXiv

AutoKaggle: A Multi-Agent Framework for Autonomous Data Science Competitions

Oct 29, 2024
Via arXiv

Large Spatial Model: End-to-end Unposed Images to Semantic 3D

Oct 24, 2024
Via arXiv

AttentionPainter: An Efficient and Adaptive Stroke Predictor for Scene Painting

Oct 21, 2024
Via arXiv

Multimodal Policies with Physics-informed Representations

Oct 20, 2024
Via arXiv

HumanEval-V: Evaluating Visual Understanding and Reasoning Abilities of Large Multimodal Models Through Coding Tasks

Oct 16, 2024
Via arXiv

LoGS: Visual Localization via Gaussian Splatting with Fewer Training Images

Oct 15, 2024
Via arXiv

Aria: An Open Multimodal Native Mixture-of-Experts Model

Oct 08, 2024
Via arXiv

Scene Flow as a Partial Differential Equation

Oct 02, 2024
Via arXiv

HR-Extreme: A High-Resolution Dataset for Extreme Weather Forecasting

Sep 27, 2024
Via arXiv