Jinzuomu Zhong

Rethinking Discrete Speech Representation Tokens for Accent Generation

Jan 27, 2026
Via arXiv

Pairwise Evaluation of Accent Similarity in Speech Synthesis

May 20, 2025
Via arXiv

AccentBox: Towards High-Fidelity Zero-Shot Accent Generation

Sep 13, 2024
Via arXiv

Prior-agnostic Multi-scale Contrastive Text-Audio Pre-training for Parallelized TTS Frontend Modeling

Apr 14, 2024
Via arXiv

Multi-Modal Automatic Prosody Annotation with Contrastive Pretraining of SSWP

Sep 11, 2023
Via arXiv