Zhiyuan Ma

Pixel-level and Semantic-level Adjustable Super-resolution: A Dual-LoRA Approach

Dec 04, 2024
Via arXiv

VideoDirector: Precise Video Editing via Text-to-Video Models

Nov 26, 2024
Via arXiv

MVBoost: Boost 3D Reconstruction with Multi-View Refinement

Nov 26, 2024
Via arXiv

CMAL: A Novel Cross-Modal Associative Learning Framework for Vision-Language Pre-Training

Oct 16, 2024
Via arXiv

DH-VTON: Deep Text-Driven Virtual Try-On via Hybrid Attention Learning

Oct 16, 2024
Via arXiv

Efficient Diffusion Models: A Comprehensive Survey from Principles to Practices

Oct 15, 2024
Via arXiv

Mirror-Consistency: Harnessing Inconsistency in Majority Voting

Oct 07, 2024
Via arXiv

Safe-SD: Safe and Traceable Stable Diffusion with Text Prompt Trigger for Invisible Generative Watermarking

Jul 19, 2024
Via arXiv

Dense Multimodal Alignment for Open-Vocabulary 3D Scene Understanding

Jul 13, 2024
Via arXiv

ScaleDreamer: Scalable Text-to-3D Synthesis with Asynchronous Score Distillation

Jul 02, 2024
Via arXiv