Picture for Piotr Żelasko

Piotr Żelasko

Chain-of-Thought Prompting for Speech Translation

Add code
Sep 17, 2024
Figure 1 for Chain-of-Thought Prompting for Speech Translation
Figure 2 for Chain-of-Thought Prompting for Speech Translation
Figure 3 for Chain-of-Thought Prompting for Speech Translation
Figure 4 for Chain-of-Thought Prompting for Speech Translation
Viaarxiv icon

Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition

Add code
Sep 17, 2024
Figure 1 for Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition
Figure 2 for Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition
Figure 3 for Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition
Figure 4 for Large Language Model Based Generative Error Correction: A Challenge and Baselines for Speech Recognition, Speaker Tagging, and Emotion Recognition
Viaarxiv icon

BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5

Add code
Jun 28, 2024
Figure 1 for BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5
Figure 2 for BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5
Figure 3 for BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5
Figure 4 for BESTOW: Efficient and Streamable Speech Language Model with the Best of Two Worlds in GPT and T5
Viaarxiv icon

Less is More: Accurate Speech Recognition & Translation without Web-Scale Data

Add code
Jun 28, 2024
Viaarxiv icon

Regularizing Contrastive Predictive Coding for Speech Applications

Add code
Apr 26, 2023
Figure 1 for Regularizing Contrastive Predictive Coding for Speech Applications
Figure 2 for Regularizing Contrastive Predictive Coding for Speech Applications
Figure 3 for Regularizing Contrastive Predictive Coding for Speech Applications
Figure 4 for Regularizing Contrastive Predictive Coding for Speech Applications
Viaarxiv icon

Fast and parallel decoding for transducer

Add code
Oct 31, 2022
Viaarxiv icon

Delay-penalized transducer for low-latency streaming ASR

Add code
Oct 31, 2022
Viaarxiv icon

Time-domain speech super-resolution with GAN based modeling for telephony speaker verification

Add code
Sep 04, 2022
Figure 1 for Time-domain speech super-resolution with GAN based modeling for telephony speaker verification
Figure 2 for Time-domain speech super-resolution with GAN based modeling for telephony speaker verification
Figure 3 for Time-domain speech super-resolution with GAN based modeling for telephony speaker verification
Figure 4 for Time-domain speech super-resolution with GAN based modeling for telephony speaker verification
Viaarxiv icon

Non-Contrastive Self-Supervised Learning of Utterance-Level Speech Representations

Add code
Aug 10, 2022
Figure 1 for Non-Contrastive Self-Supervised Learning of Utterance-Level Speech Representations
Figure 2 for Non-Contrastive Self-Supervised Learning of Utterance-Level Speech Representations
Figure 3 for Non-Contrastive Self-Supervised Learning of Utterance-Level Speech Representations
Figure 4 for Non-Contrastive Self-Supervised Learning of Utterance-Level Speech Representations
Viaarxiv icon

Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition

Add code
Jan 28, 2022
Figure 1 for Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition
Figure 2 for Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition
Figure 3 for Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition
Figure 4 for Discovering Phonetic Inventories with Crosslingual Automatic Speech Recognition
Viaarxiv icon