Xiong Wang

Long-VITA: Scaling Large Multi-modal Models to 1 Million Tokens with Leading Short-Context Accuracy

Feb 07, 2025

LUCY: Linguistic Understanding and Control Yielding Early Stage of Her

Jan 27, 2025

VITA-1.5: Towards GPT-4o Level Real-Time Vision and Speech Interaction

Jan 03, 2025

Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

Nov 01, 2024

A Transcription Prompt-based Efficient Audio Large Language Model for Robust Speech Recognition

Aug 18, 2024

VITA: Towards Open-Source Interactive Omni Multimodal LLM

Aug 09, 2024

Interacting Particle Systems on Networks: joint inference of the network and the interaction kernel

Feb 13, 2024

Optimal minimax rate of learning interaction kernels

Nov 28, 2023

FedSN: A General Federated Learning Framework over LEO Satellite Networks

Nov 02, 2023

Fusing Monocular Images and Sparse IMU Signals for Real-time Human Motion Capture

Sep 01, 2023