Sachin Kajarekar

CALM: Contrastive Aligned Audio-Language Multirate and Multimodal Representations

Feb 08, 2022
Via arXiv

Streaming on-device detection of device directed speech from voice and touch-based invocation

Oct 09, 2021
Via arXiv

Analysis and Tuning of a Voice Assistant System for Dysfluent Speech

Jun 18, 2021
Via arXiv

SEP-28k: A Dataset for Stuttering Event Detection From Podcasts With People Who Stutter

Feb 24, 2021
Via arXiv

Knowledge Transfer for Efficient On-device False Trigger Mitigation

Oct 20, 2020
Via arXiv

Audiovisual Speech Synthesis using Tacotron2

Aug 03, 2020
Via arXiv

Self-supervised Learning of Visual Speech Features with Audiovisual Speech Enhancement

May 06, 2020
Via arXiv

Detecting Emotion Primitives from Speech and their use in discerning Categorical Emotions

Jan 31, 2020
Via arXiv

Multi-task Learning for Speaker Verification and Voice Trigger Detection

Jan 26, 2020
Via arXiv