Weixin Li

Amy

FutureVLA: Joint Visuomotor Prediction for Vision-Language-Action Model

Mar 11, 2026

Memory-Guided View Refinement for Dynamic Human-in-the-loop EQA

Mar 10, 2026

MSGNav: Unleashing the Power of Multi-modal 3D Scene Graph for Zero-Shot Embodied Navigation

Nov 14, 2025

PolySim: Bridging the Sim-to-Real Gap for Humanoid Control via Multi-Simulator Dynamics Randomization

Oct 02, 2025

Multi-Grained Compositional Visual Clue Learning for Image Intent Recognition

Apr 25, 2025

Generating Editable Head Avatars with 3D Gaussian GANs

Dec 26, 2024

Consistent Diffusion: Denoising Diffusion Model with Data-Consistent Training for Image Restoration

Dec 17, 2024

Leveraging Predicate and Triplet Learning for Scene Graph Generation

Jun 04, 2024

ReWiTe: Realistic Wide-angle and Telephoto Dual Camera Fusion Dataset via Beam Splitter Camera Rig

Apr 16, 2024

BMLP: Behavior-aware MLP for Heterogeneous Sequential Recommendation

Feb 20, 2024