Siddharth Sigtia

A Multimodal Approach to Device-Directed Speech Detection with Large Language Models

Mar 26, 2024
Via arXiv

Multimodal Data and Resource Efficient Device-Directed Speech Detection with Large Foundation Models

Dec 06, 2023
Via arXiv

Improving Voice Trigger Detection with Metric Learning

Apr 05, 2022
Via arXiv

Streaming Transformer for Hardware Efficient Voice Trigger Detection and False Trigger Mitigation

May 14, 2021
Via arXiv

Progressive Voice Trigger Detection: Accuracy vs Latency

Oct 29, 2020
Via arXiv

Hybrid Transformer/CTC Networks for Hardware Efficient Voice Triggering

Aug 05, 2020
Via arXiv

Multi-task Learning for Speaker Verification and Voice Trigger Detection

Jan 26, 2020
Via arXiv

Multi-task Learning for Voice Trigger Detection

Jan 26, 2020
Via arXiv

Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging

Nov 29, 2016
Via arXiv

Automatic Environmental Sound Recognition: Performance versus Computational Cost

Jul 15, 2016
Via arXiv