Wenze Hu

UniVG: A Generalist Diffusion Model for Unified Image Generation and Editing

Mar 16, 2025 (via arXiv)

DiT-Air: Revisiting the Efficiency of Diffusion Model Architecture Design in Text to Image Generation

Mar 13, 2025 (via arXiv)

STIV: Scalable Text and Image Conditioned Video Generation

Dec 10, 2024 (via arXiv)

Revisit Large-Scale Image-Caption Data in Pre-training Multimodal Foundation Models

Oct 03, 2024 (via arXiv)

Guiding Instruction-based Image Editing via Multimodal Large Language Models

Sep 29, 2023 (via arXiv)

Million-scale Object Detection with Large Vision Model

Dec 19, 2022 (via arXiv)

NAR-Former: Neural Architecture Representation Learning towards Holistic Attributes Prediction

Nov 15, 2022 (via arXiv)

ParCNetV2: Oversized Kernel with Enhanced Attention

Nov 14, 2022 (via arXiv)

CabViT: Cross Attention among Blocks for Vision Transformer

Nov 14, 2022 (via arXiv)

Fast-ParC: Position Aware Global Kernel for ConvNets and ViTs

Oct 08, 2022 (via arXiv)