Tom Ko

Towards Achieving Human Parity on End-to-end Simultaneous Speech Translation via LLM Agent

Jul 31, 2024

Learning Retrieval Augmentation for Personalized Dialogue Generation

Jun 27, 2024

Selective Prompting Tuning for Personalized Conversations with LLMs

Jun 26, 2024

Speech Translation with Large Language Models: An Industrial Practice

Dec 21, 2023

RepCodec: A Speech Representation Codec for Speech Tokenization

Aug 31, 2023

Recent Advances in Direct Speech-to-text Translation

Jun 20, 2023

MOSPC: MOS Prediction Based on Pairwise Comparison

Jun 18, 2023

PolyVoice: Language Models for Speech to Speech Translation

Jun 13, 2023

CTC-based Non-autoregressive Speech Translation

May 27, 2023

DUB: Discrete Unit Back-translation for Speech Translation

May 19, 2023