Zhiyao Duan

HARP 2.0: Expanding Hosted, Asynchronous, Remote Processing for Deep Learning in the DAW

Mar 04, 2025

Audio Visual Segmentation Through Text Embeddings

Feb 22, 2025

SVDD 2024: The Inaugural Singing Voice Deepfake Detection Challenge

Aug 28, 2024

Generating Data with Text-to-Speech and Large-Language Models for Conversational Speech Recognition

Aug 17, 2024

A Multi-Stream Fusion Approach with One-Class Learning for Audio-Visual Deepfake Detection

Jun 20, 2024

Articulatory Phonetics Informed Controllable Expressive Speech Synthesis

Jun 15, 2024

GTR-Voice: Articulatory Phonetics Informed Controllable Expressive Speech Synthesis

Jun 15, 2024

CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection

Jun 04, 2024

SVDD Challenge 2024: A Singing Voice Deepfake Detection Challenge Evaluation Plan

May 08, 2024

Scoring Intervals using Non-Hierarchical Transformer For Automatic Piano Transcription

Apr 17, 2024