Zhengkai Jiang

RealVVT: Towards Photorealistic Video Virtual Try-on via Spatio-Temporal Consistency

Jan 15, 2025

EnerVerse: Envisioning Embodied Future Space for Robotics Manipulation

Jan 03, 2025

Foundation Cures Personalization: Recovering Facial Personalized Models' Prompt Consistency

Nov 22, 2024

Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement

Nov 15, 2024

UniAff: A Unified Representation of Affordances for Tool Usage and Articulation with Vision-Language Models

Sep 30, 2024

SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation

Sep 26, 2024

OSV: One Step is Enough for High-Quality Image to Video Generation

Sep 17, 2024

Adapted-MoE: Mixture of Experts with Test-Time Adaption for Anomaly Detection

Sep 09, 2024

Temporal and Interactive Modeling for Efficient Human-Human Motion Generation

Aug 30, 2024

MDT-A2G: Exploring Masked Diffusion Transformers for Co-Speech Gesture Generation

Aug 06, 2024