Ao Ma

HiCo: Hierarchical Controllable Diffusion Model for Layout-to-image Generation

Oct 18, 2024
Via arXiv

Qihoo-T2X: An Efficiency-Focused Diffusion Transformer via Proxy Tokens for Text-to-Any-Task

Sep 06, 2024
Via arXiv

FancyVideo: Towards Dynamic and Consistent Video Generation via Cross-frame Textual Guidance

Aug 15, 2024
Via arXiv

Multimodal Causal Reasoning Benchmark: Challenging Vision Large Language Models to Infer Causal Links Between Siamese Images

Aug 15, 2024
Via arXiv

MDD-UNet: Domain Adaptation for Medical Image Segmentation with Theoretical Guarantees, a Proof of Concept

Dec 19, 2023
Via arXiv

Res-Tuning: A Flexible and Efficient Tuning Paradigm via Unbinding Tuner from Backbone

Oct 30, 2023
Via arXiv