Shruti Palaskar

Multimodal Large Language Models with Fusion Low Rank Adaptation for Device Directed Speech Detection

Jun 13, 2024

On Advances in Text Generation from Images Beyond Captioning: A Case Study in Self-Rationalization

May 24, 2022

Speech Summarization using Restricted Self-Attention

Oct 12, 2021

How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language

Aug 18, 2020

ASR Error Correction and Domain Adaptation Using Machine Translation

Mar 13, 2020

Multimodal Abstractive Summarization for How2 Videos

Jun 19, 2019

Learned In Speech Recognition: Contextual Acoustic Word Embeddings

Feb 18, 2019

Learning from Multiview Correlations in Open-Domain Videos

Nov 21, 2018

Multimodal Grounding for Sequence-to-Sequence Speech Recognition

Nov 09, 2018

How2: A Large-scale Dataset for Multimodal Language Understanding

Nov 01, 2018