Xiong Wang

Qwen2.5-Omni Technical Report

Mar 26, 2025

InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training

Mar 04, 2025

Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy

Feb 07, 2025

LUCY: Linguistic Understanding and Control Yielding Early Stage of Her

Jan 27, 2025

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Jan 03, 2025

Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

Nov 01, 2024

A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition

Aug 18, 2024

VITA: Towards Open-Source Interactive Omni Multimodal LLM

Aug 09, 2024

Interacting Particle Systems on Networks: joint inference of the network and the interaction kernel

Feb 13, 2024

Optimal minimax rate of learning interaction kernels

Nov 28, 2023