Mingyu Ding

DexHandDiff: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation

Dec 11, 2024

Moto: Latent Motion Token as the Bridging Language for Robot Manipulation

Dec 05, 2024

GRAPE: Generalizing Robot Policy via Preference Alignment

Nov 28, 2024

DexDiffuser: Interaction-aware Diffusion Planning for Adaptive Dexterous Manipulation

Nov 27, 2024

DexH2R: Task-oriented Dexterous Manipulation from Human to Robots

Nov 07, 2024

X-Drive: Cross-modality consistent multi-sensor data synthesis for driving scenarios

Nov 02, 2024

Language-Driven Policy Distillation for Cooperative Driving in Multi-Agent Reinforcement Learning

Oct 31, 2024

MoLE: Enhancing Human-centric Text-to-image Diffusion via Mixture of Low-rank Experts

Oct 30, 2024

CompGS: Unleashing 2D Compositionality for Compositional Text-to-3D via Dynamically Optimizing 3D Gaussians

Oct 28, 2024

MMIE: Massive Multimodal Interleaved Comprehension Benchmark for Large Vision-Language Models

Oct 14, 2024