Gary Wang

Zero-shot Cross-lingual Voice Transfer for TTS

Sep 20, 2024

Adversarial training of Keyword Spotting to Minimize TTS Data Overfitting

Aug 20, 2024

Utilizing TTS Synthesized Data for Efficient Development of Keyword Spotting Model

Jul 26, 2024

ASTRA: Aligning Speech and Text Representations for Asr without Sampling

Jun 10, 2024

Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data

Feb 29, 2024

High-precision Voice Search Query Correction via Retrievable Speech-text Embedings

Jan 08, 2024

Using Text Injection to Improve Recognition of Personal Identifiers in Speech

Aug 14, 2023

Understanding Shared Speech-Text Representations

Apr 27, 2023

Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages

Mar 03, 2023

Modular Hybrid Autoregressive Transducer

Oct 31, 2022