Shaolei Zhang

LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token

Jan 07, 2025

Large Language Models Are Read/Write Policy-Makers for Simultaneous Generation

Jan 01, 2025

Auto-RAG: Autonomous Retrieval-Augmented Generation for Large Language Models

Nov 29, 2024

BayLing 2: A Multilingual Large Language Model with Efficient Language Alignment

Nov 25, 2024

LLaMA-Omni: Seamless Speech Interaction with Large Language Models

Sep 10, 2024

Agent-SiMT: Agent-assisted Simultaneous Machine Translation with Large Language Models

Jun 12, 2024

Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data?

Jun 11, 2024

A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Any Translation

Jun 11, 2024

Decoder-only Streaming Transformer for Simultaneous Translation

Jun 06, 2024

StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning

Jun 05, 2024