Mike Seltzer

Get Large Language Models Ready to Speak: A Late-fusion Approach for Speech Generation

Oct 27, 2024
Via arXiv

Effective internal language model training and fusion for factorized transducer model

Apr 02, 2024
Via arXiv

Towards General-Purpose Speech Abilities for Large Language Models Using Unpaired Data

Nov 12, 2023
Via arXiv

TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models

Sep 05, 2023
Via arXiv

Prompting Large Language Models with Speech Recognition Abilities

Jul 21, 2023
Via arXiv

Multi-Head State Space Model for Speech Recognition

May 25, 2023
Via arXiv

Dynamic Speech Endpoint Detection with Regression Targets

Oct 25, 2022
Via arXiv

Streaming Transformer Transducer Based Speech Recognition Using Non-Causal Convolution

Oct 07, 2021
Via arXiv

Transferring Voice Knowledge for Acoustic Event Detection: An Empirical Study

Oct 07, 2021
Via arXiv

On lattice-free boosted MMI training of HMM and CTC-based full-context ASR models

Jul 09, 2021
Via arXiv