Andrew Rosenberg

Zero-shot Cross-lingual Voice Transfer for TTS

Sep 20, 2024

STAB: Speech Tokenizer Assessment Benchmark

Sep 04, 2024

Adversarial training of Keyword Spotting to Minimize TTS Data Overfitting

Aug 20, 2024

Utilizing TTS Synthesized Data for Efficient Development of Keyword Spotting Model

Jul 26, 2024

Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models

Jul 05, 2024

Speech Prefix-Tuning with RNNT Loss for Improving LLM Predictions

Jun 20, 2024

ASTRA: Aligning Speech and Text Representations for ASR without Sampling

Jun 10, 2024

Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data

Feb 29, 2024

High-precision Voice Search Query Correction via Retrievable Speech-text Embeddings

Jan 08, 2024

O-1: Self-training with Oracle and 1-best Hypothesis

Aug 14, 2023