Zhengkai Jiang

Foundation Cures Personalization: Recovering Facial Personalized Models' Prompt Consistency

Nov 22, 2024

Region-Aware Text-to-Image Generation via Hard Binding and Soft Refinement

Nov 15, 2024

UniAff: A Unified Representation of Affordances for Tool Usage and Articulation with Vision-Language Models

Sep 30, 2024

SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation

Sep 26, 2024

OSV: One Step is Enough for High-Quality Image to Video Generation

Sep 17, 2024

Adapted-MoE: Mixture of Experts with Test-Time Adaption for Anomaly Detection

Sep 09, 2024

Temporal and Interactive Modeling for Efficient Human-Human Motion Generation

Aug 30, 2024

MDT-A2G: Exploring Masked Diffusion Transformers for Co-Speech Gesture Generation

Aug 06, 2024

NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models

May 31, 2024

VividPose: Advancing Stable Video Diffusion for Realistic Human Image Animation

May 28, 2024