Yong Ren

WMCodec: End-to-End Neural Speech Codec with Deep Watermarking for Authenticity Verification

Sep 18, 2024
Via arXiv

Towards Diverse and Efficient Audio Captioning via Diffusion Models

Sep 14, 2024
Via arXiv

STA-V2A: Video-to-Audio Generation with Semantic and Temporal Alignment

Sep 13, 2024
Via arXiv

Utilizing Speaker Profiles for Impersonation Audio Detection

Aug 30, 2024
Via arXiv

ADD 2023: Towards Audio Deepfake Detection and Analysis in the Wild

Aug 09, 2024
Via arXiv

An Unsupervised Domain Adaptation Method for Locating Manipulated Region in partially fake Audio

Jul 11, 2024
Via arXiv

Video-to-Audio Generation with Hidden Alignment

Jul 10, 2024
Via arXiv

TraceableSpeech: Towards Proactively Traceable Text-to-Speech with Watermarking

Jun 07, 2024
Via arXiv

Controllable Residual Speaker Representation for Voice Conversion

Sep 15, 2023
Via arXiv

ADD 2023: the Second Audio Deepfake Detection Challenge

May 23, 2023
Via arXiv