Ramon Sanabria

Transforming LLMs into Cross-modal and Cross-lingual Retrieval Systems

Apr 04, 2024
Via arXiv

Layer-Wise Analysis of Self-Supervised Acoustic Word Embeddings: A Study on Speech Emotion Recognition

Feb 04, 2024
Via arXiv

Acoustic Word Embeddings for Untranscribed Target Languages with Continued Pretraining and Learned Pooling

Jun 03, 2023
Via arXiv

The Edinburgh International Accents of English Corpus: Towards the Democratization of English ASR

Mar 31, 2023
Via arXiv

Analyzing Acoustic Word Embeddings from Pre-trained Self-supervised Speech Models

Oct 28, 2022
Via arXiv

Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training

Mar 02, 2022
Via arXiv

On the Difficulty of Segmenting Words with Attention

Sep 21, 2021
Via arXiv

Talk, Don't Write: A Study of Direct Speech-Based Image Retrieval

Apr 08, 2021
Via arXiv

Multimodal Speech Recognition with Unstructured Audio Masking

Oct 16, 2020
Via arXiv

Fine-Grained Grounding for Multimodal Speech Recognition

Oct 05, 2020
Via arXiv