Jun He

ByteDance

UrbanFeel: A Comprehensive Benchmark for Temporal and Perceptual Understanding of City Scenes through Human Perspective

Sep 26, 2025
Via arXiv

Echo-4o: Harnessing the Power of GPT-4o Synthetic Images for Improved Image Generation

Aug 13, 2025
Via arXiv

Quantize More, Lose Less: Autoregressive Generation from Residually Quantized Speech Representations

Jul 16, 2025
Via arXiv

Estimate Hitting Time by Hitting Probability for Elitist Evolutionary Algorithms

Jun 18, 2025
Via arXiv

SyncTalk++: High-Fidelity and Efficient Synchronized Talking Heads Synthesis Using Gaussian Splatting

Jun 17, 2025
Via arXiv

OmniSync: Towards Universal Lip Synchronization via Diffusion Transformers

May 27, 2025
Via arXiv

DualTalk: Dual-Speaker Interaction for 3D Talking Head Conversations

May 26, 2025
Via arXiv

MEGADance: Mixture-of-Experts Architecture for Genre-Aware 3D Dance Generation

May 23, 2025
Via arXiv

MatchDance: Collaborative Mamba-Transformer Architecture Matching for High-Quality 3D Dance Synthesis

May 21, 2025
Via arXiv

GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation

Apr 03, 2025
Via arXiv