Piotr Zelasko

NeKo: Toward Post Recognition Generative Correction Large Language Models with Task-Oriented Experts

Nov 08, 2024
Via arXiv

VoiceTextBlender: Augmenting Large Language Models with Speech Capabilities via Single-Stage Joint Speech-Text Supervised Fine-Tuning

Oct 23, 2024
Via arXiv

Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation

Oct 31, 2022
Via arXiv

Defense against Adversarial Attacks on Hybrid Speech Recognition using Joint Adversarial Fine-tuning with Denoiser

Apr 08, 2022
Via arXiv

Earnings-21: A Practical Benchmark for ASR in the Wild

Apr 28, 2021
Via arXiv

Learning Speaker Embedding from Text-to-Speech

Oct 21, 2020
Via arXiv

Punctuation Prediction in Spontaneous Conversations: Can We Mitigate ASR Errors with Retrofitted Word Embeddings?

Apr 13, 2020
Via arXiv