Roland Maas

Two-pass Endpoint Detection for Speech Recognition

Jan 17, 2024

Cross-utterance ASR Rescoring with Graph-based Label Propagation

Mar 27, 2023

Leveraging Redundancy in Multiple Audio Signals for Far-Field Speech Recognition

Mar 01, 2023

Reducing Geographic Disparities in Automatic Speech Recognition via Elastic Weight Consolidation

Jul 16, 2022

VADOI: Voice-Activity-Detection Overlapping Inference For End-to-end Long-form Speech Recognition

Feb 22, 2022

Do You Listen with One or Two Microphones? A Unified ASR Model for Single and Multi-Channel Audio

Jun 28, 2021

SynthASR: Unlocking Synthetic Data for Speech Recognition

Jun 14, 2021

Attention-based Neural Beamforming Layers for Multi-channel Speech Recognition

May 14, 2021

Wav2vec-C: A Self-supervised Model for Speech Representation Learning

Mar 09, 2021

REDAT: Accent-Invariant Representation for End-to-End ASR by Domain Adversarial Training with Relabeling

Dec 14, 2020