Qingkai Fang

LLaMA-Omni: Seamless Speech Interaction with Large Language Models

Sep 10, 2024
Via arXiv

CTC-based Non-autoregressive Textless Speech-to-Speech Translation

Jun 11, 2024
Via arXiv

A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Any Translation

Jun 11, 2024
Via arXiv

Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data?

Jun 11, 2024
Via arXiv

StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning

Jun 05, 2024
Via arXiv

Bridging the Gap between Synthetic and Authentic Images for Multimodal Machine Translation

Oct 20, 2023
Via arXiv

DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation

Oct 11, 2023
Via arXiv

BayLing: Bridging Cross-lingual Alignment and Instruction Following through Interactive Translation for Large Language Models

Jun 21, 2023
Via arXiv

CMOT: Cross-modal Mixup via Optimal Transport for Speech Translation

May 25, 2023
Via arXiv

Understanding and Bridging the Modality Gap for Speech Translation

May 15, 2023
Via arXiv