Xiong Xiao

Entire Chain Uplift Modeling with Context-Enhanced Learning for Intelligent Marketing

Feb 04, 2024

NOTSOFAR-1 Challenge: New Datasets, Baseline, and Tasks for Distant Meeting Transcription

Jan 16, 2024

Profile-Error-Tolerant Target-Speaker Voice Activity Detection

Sep 21, 2023

A robust method for reliability updating with equality information using sequential adaptive importance sampling

Mar 08, 2023

Speaker Change Detection for Transformer Transducer ASR

Feb 16, 2023

Target Speaker Voice Activity Detection with Transformers and Its Integration with End-to-End Neural Diarization

Aug 27, 2022

Streaming Speaker-Attributed ASR with Token-Level Speaker Embeddings

Mar 30, 2022

Streaming Multi-Talker ASR with Token-Level Serialized Output Training

Feb 05, 2022

Separating Long-Form Speech with Group-Wise Permutation Invariant Training

Nov 17, 2021

WavLM: Large-Scale Self-Supervised Pre-Training for Full Stack Speech Processing

Oct 29, 2021