Pengyuan Zhang

Transliterated Zero-Shot Domain Adaptation for Automatic Speech Recognition

Dec 15, 2024

SF-Speech: Straightened Flow for Zero-Shot Voice Clone on Small-Scale Dataset

Oct 16, 2024

Emilia: An Extensive, Multilingual, and Diverse Speech Dataset for Large-Scale Speech Generation

Jul 07, 2024

TRNet: Two-level Refinement Network leveraging Speech Enhancement for Noise Robust Speech Emotion Recognition

Apr 19, 2024

Modality-Collaborative Transformer with Hybrid Feature Reconstruction for Robust Emotion Recognition

Dec 26, 2023

DSNet: Disentangled Siamese Network with Neutral Calibration for Speech Emotion Recognition

Dec 25, 2023

Enhancing Spoofing Speech Detection Using Rhythm Information

Oct 18, 2023

Synthetic Speech Detection Based on Temporal Consistency and Distribution of Speaker Features

Sep 29, 2023

The Impact of Silence on Speech Anti-Spoofing

Sep 21, 2023

Improving Short Utterance Anti-Spoofing with AASIST2

Sep 15, 2023