Yuxuan Wang

Vehicles, Pedestrians, and E-bikes: a Three-party Game at Right-turn-on-red Crossroads Revealing the Dual and Irrational Role of E-bikes that Risks Traffic Safety

Nov 04, 2024

Fish-Speech: Leveraging Large Language Models for Advanced Multilingual Text-to-Speech Synthesis

Nov 02, 2024

IntrinsicVoice: Empowering LLMs with Intrinsic Real-time Voice Interaction Abilities

Oct 09, 2024

Metadata Matters for Time Series: Informative Forecasting with Transformers

Oct 04, 2024

Seed-Music: A Unified Framework for High Quality and Controlled Music Generation

Sep 13, 2024

NEST-RQ: Next Token Prediction for Speech Self-Supervised Pre-Training

Sep 13, 2024

VideoLLaMB: Long-context Video Understanding with Recurrent Memory Bridges

Sep 02, 2024

Understanding Multimodal Hallucination with Parameter-Free Representation Alignment

Sep 02, 2024

Language Model Can Listen While Speaking

Aug 05, 2024

ExoViP: Step-by-step Verification and Exploration with Exoskeleton Modules for Compositional Visual Reasoning

Aug 05, 2024