Zhengkai Jiang

UniAff: A Unified Representation of Affordances for Tool Usage and Articulation with Vision-Language Models

Sep 30, 2024

SKT: Integrating State-Aware Keypoint Trajectories with Vision-Language Models for Robotic Garment Manipulation

Sep 26, 2024

OSV: One Step is Enough for High-Quality Image to Video Generation

Sep 17, 2024

Adapted-MoE: Mixture of Experts with Test-Time Adaption for Anomaly Detection

Sep 09, 2024

Temporal and Interactive Modeling for Efficient Human-Human Motion Generation

Aug 30, 2024

MDT-A2G: Exploring Masked Diffusion Transformers for Co-Speech Gesture Generation

Aug 06, 2024

NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models

May 31, 2024

VividPose: Advancing Stable Video Diffusion for Realistic Human Image Animation

May 28, 2024

AdapNet: Adaptive Noise-Based Network for Low-Quality Image Retrieval

May 28, 2024

Efficient Multimodal Large Language Models: A Survey

May 17, 2024