Shang-Wen Li

How to Learn a New Language? An Efficient Solution for Self-Supervised Learning Models Unseen Languages Adaption in Low-Resource Scenario

Nov 27, 2024
Via arXiv

Altogether: Image Captioning via Re-aligning Alt-text

Oct 22, 2024
Via arXiv

SpeechPrompt: Prompting Speech Language Models for Speech Processing Tasks

Aug 23, 2024
Via arXiv

Disentangled Training with Adversarial Examples For Robust Small-footprint Keyword Spotting

Aug 23, 2024
Via arXiv

Text Quality-Based Pruning for Efficient Training of Language Models

Apr 26, 2024
Via arXiv

MoDE: CLIP Data Experts via Clustering

Apr 24, 2024
Via arXiv

A Large-Scale Evaluation of Speech Foundation Models

Apr 15, 2024
Via arXiv

SpeechDPR: End-to-End Spoken Passage Retrieval for Open-Domain Spoken Question Answering

Jan 24, 2024
Via arXiv

GSQA: An End-to-End Model for Generative Spoken Question Answering

Dec 25, 2023
Via arXiv

FLAP: Fast Language-Audio Pre-training

Nov 02, 2023
Via arXiv