Jun Du

Col-OLHTR: A Novel Framework for Multimodal Online Handwritten Text Recognition

Feb 10, 2025

Latent Swap Joint Diffusion for Long-Form Audio Generation

Feb 07, 2025

PaMMA-Net: Plasmas magnetic measurement evolution based on data-driven incremental accumulative prediction

Jan 23, 2025

Skeleton and Font Generation Network for Zero-shot Chinese Character Generation

Jan 14, 2025

Learned Data Compression: Challenges and Opportunities for the Future

Dec 14, 2024

RFL: Simplifying Chemical Structure Recognition with Ring-Free Language

Dec 10, 2024

Joint Optimization of Communication Enhancement and Location Privacy Protection in RIS-Assisted Underwater Communication System

Nov 30, 2024

EmotiveTalk: Expressive Talking Head Generation through Audio Information Decoupling and Emotional Video Diffusion

Nov 23, 2024

MVANet: Multi-Stage Video Attention Network for Sound Event Localization and Detection with Source Distance Estimation

Nov 21, 2024

DCF-DS: Deep Cascade Fusion of Diarization and Separation for Speech Recognition under Realistic Single-Channel Conditions

Nov 11, 2024