Tsz Kin Lam

From TOWER to SPIRE: Adding the Speech Modality to a Text-Only LLM

Mar 13, 2025

Prepending or Cross-Attention for Speech-to-Text? An Empirical Comparison

Jan 04, 2025

Findings of the IWSLT 2024 Evaluation Campaign

Nov 07, 2024

Pitfalls and Outlooks in Using COMET

Sep 02, 2024

Compact Speech Translation Models via Discrete Speech Units Pretraining

Feb 29, 2024

Prosody in Cascade and Direct Speech-to-Text Translation: a case study on Korean Wh-Phrases

Feb 01, 2024

Make More of Your Data: Minimal Effort Data Augmentation for Automatic Speech Recognition and Translation

Oct 27, 2022

Analyzing the Use of Influence Functions for Instance-Specific Data Filtering in Neural Machine Translation

Oct 24, 2022

Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation

Mar 16, 2022

On-the-Fly Aligned Data Augmentation for Sequence-to-Sequence ASR

Apr 03, 2021