Mike Zheng Shou

DoraCycle: Domain-Oriented Adaptation of Unified Generative Model in Multimodal Cycles

Mar 05, 2025
Via arXiv

Difix3D+: Improving 3D Reconstructions with Single-Step Diffusion Models

Mar 03, 2025
Via arXiv

PhotoDoodle: Learning Artistic Image Editing from Few-Shot Pairwise Data

Feb 23, 2025
Via arXiv

InterFeedback: Unveiling Interactive Intelligence of Large Multimodal Models via Human Feedback

Feb 20, 2025
Via arXiv

PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning

Feb 17, 2025
Via arXiv

WorldGUI: Dynamic Testing for Comprehensive Desktop GUI Automation

Feb 12, 2025
Via arXiv

UniMoD: Efficient Unified Multimodal Transformers with Mixture-of-Depths

Feb 10, 2025
Via arXiv

MakeAnything: Harnessing Diffusion Transformers for Multi-Domain Procedural Sequence Generation

Feb 03, 2025
Via arXiv

LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer

Feb 03, 2025
Via arXiv

DiffSim: Taming Diffusion Models for Evaluating Visual Similarity

Dec 19, 2024
Via arXiv