Jiangning Zhang

Decouple and Track: Benchmarking and Improving Video Diffusion Transformers for Motion Transfer

Mar 21, 2025
Via arXiv

Image Inversion: A Survey from GANs to Diffusion and Beyond

Feb 17, 2025
Via arXiv

RWKV-UNet: Improving UNet with Long-Range Cooperation for Effective Medical Image Segmentation

Jan 14, 2025
Via arXiv

Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs

Jan 08, 2025
Via arXiv

SVFR: A Unified Framework for Generalized Video Face Restoration

Jan 03, 2025
Via arXiv

Improving Autoregressive Visual Generation with Cluster-Oriented Token Prediction

Jan 01, 2025
Via arXiv

EMOv2: Pushing 5M Vision Model Frontier

Dec 09, 2024
Via arXiv

Exploring Real&Synthetic Dataset and Linear Attention in Image Restoration

Dec 05, 2024
Via arXiv

DynamicControl: Adaptive Condition Selection for Improved Text-to-Image Generation

Dec 04, 2024
Via arXiv

Unveil Inversion and Invariance in Flow Transformer for Versatile Image Editing

Nov 26, 2024
Via arXiv