Ruijie Tao

Unified Audio Event Detection

Sep 13, 2024

Multi-Stage Face-Voice Association Learning with Keynote Speaker Diarization

Jul 25, 2024

A Benchmark for Multi-speaker Anonymization

Jul 08, 2024

Temporal-Channel Modeling in Multi-head Self-Attention for Synthetic Speech Detection

Jun 25, 2024

Target Speech Diarization with Multimodal Prompts

Jun 11, 2024

How Do Neural Spoofing Countermeasures Detect Partially Spoofed Audio?

Jun 04, 2024

Audio-Visual Target Speaker Extraction with Reverse Selective Auditory Attention

Apr 29, 2024

Enhancing Real-World Active Speaker Detection with Multi-Modal Extraction Pre-Training

Apr 01, 2024

Voice Conversion Augmentation for Speaker Recognition on Defective Datasets

Apr 01, 2024

Prompt-driven Target Speech Diarization

Oct 23, 2023