Piotr Żelasko

Chain-of-Thought Prompting for Speech Translation

Sep 17, 2024

Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition

Sep 17, 2024

Less is More: Accurate Speech Recognition & Translation without Web-Scale Data

Jun 28, 2024

BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5

Jun 28, 2024

Regularizing Contrastive Predictive Coding for Speech Applications

Apr 26, 2023

Fast and parallel decoding for transducer

Oct 31, 2022

Delay-penalized transducer for low-latency streaming ASR

Oct 31, 2022

Time-domain speech super-resolution with GAN based modeling for telephony speaker verification

Sep 04, 2022

Non-Contrastive Self-Supervised Learning of Utterance-Level Speech Representations

Aug 10, 2022

Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition

Jan 28, 2022