
John R. Hershey

Understanding Learning with Sliced-Wasserstein Requires Rethinking Informative Slices

Nov 16, 2024

Towards sub-millisecond latency real-time speech enhancement models on hearables

Sep 26, 2024

Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language

Jun 09, 2024

Objective and subjective evaluation of speech enhancement methods in the UDASE task of the 7th CHiME challenge

Feb 02, 2024

TokenSplit: Using Discrete Speech Representations for Direct, Refined, and Transcript-Conditioned Speech Separation and Recognition

Aug 21, 2023

The CHiME-7 UDASE task: Unsupervised domain adaptation for conversational speech enhancement

Jul 07, 2023

Unsupervised Multi-channel Separation and Adaptation

May 18, 2023

AudioSlots: A slot-centric generative model for audio separation

May 09, 2023

AudioScopeV2: Audio-Visual Attention Architectures for Calibrated Open-Domain On-Screen Sound Separation

Jul 20, 2022

Distance-Based Sound Separation

Jul 01, 2022