Jiaming Liu

RoboMIND 2.0: A Multimodal, Bimanual Mobile Manipulation Dataset for Generalizable Embodied Intelligence

Dec 31, 2025

Loom: Diffusion-Transformer for Interleaved Generation

Dec 20, 2025

GRACE: Designing Generative Face Video Codec via Agile Hardware-Centric Workflow

Nov 12, 2025

Every Step Evolves: Scaling Reinforcement Learning for Trillion-Scale Thinking Model

Oct 21, 2025

MLA: A Multisensory Language-Action Model for Multimodal Understanding and Forecasting in Robotic Manipulation

Sep 30, 2025

WoW: Towards a World omniscient World model Through Embodied Interaction

Sep 26, 2025

BEVUDA++: Geometric-aware Unsupervised Domain Adaptation for Multi-View 3D Object Detection

Sep 17, 2025

Awesome-OL: An Extensible Toolkit for Online Learning

Jul 27, 2025

Stable-Hair v2: Real-World Hair Transfer via Multiple-View Diffusion Model

Jul 10, 2025

AC-DiT: Adaptive Coordination Diffusion Transformer for Mobile Manipulation

Jul 02, 2025