Yaohui Wang

TimeStep Master: Asymmetrical Mixture of Timestep LoRA Experts for Versatile and Efficient Diffusion Models in Vision

Mar 10, 2025

An Egocentric Vision-Language Model based Portable Real-time Smart Assistant

Mar 06, 2025

Dimitra: Audio-driven Diffusion model for Expressive Talking Head Generation

Feb 24, 2025

Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models

Jan 14, 2025

Vinci: A Real-time Embodied Smart Assistant based on Egocentric Vision-Language Model

Dec 30, 2024

DeepSeek-V3 Technical Report

Dec 27, 2024

MuLan: Adapting Multilingual Diffusion Models for Hundreds of Languages with Negligible Cost

Dec 02, 2024

VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models

Nov 20, 2024

Fire-Flyer AI-HPC: A Cost-Effective Software-Hardware Co-Design for Deep Learning

Aug 26, 2024

SAM2-PATH: A better segment anything model for semantic segmentation in digital pathology

Aug 07, 2024