Yuping Wang

Can Large Vision Language Models Read Maps Like a Human?

Mar 18, 2025
Via arXiv

FwNet-ECA: Facilitating Window Attention with Global Receptive Fields through Fourier Filtering Operations

Feb 25, 2025
Via arXiv

Re-Align: Aligning Vision Language Models via Retrieval-Augmented Direct Preference Optimization

Feb 18, 2025
Via arXiv

DiTAR: Diffusion Transformer Autoregressive Modeling for Speech Generation

Feb 06, 2025
Via arXiv

A Privacy-Preserving Domain Adversarial Federated learning for multi-site brain functional connectivity analysis

Feb 03, 2025
Via arXiv

Audio-CoT: Exploring Chain-of-Thought Reasoning in Large Audio Language Model

Jan 13, 2025
Via arXiv

OpenEMMA: Open-Source Multimodal Model for End-to-End Autonomous Driving

Dec 19, 2024
Via arXiv

Seed-Music: A Unified Framework for High Quality and Controlled Music Generation

Sep 13, 2024
Via arXiv

MA-CDMR: An Intelligent Cross-domain Multicast Routing Method based on Multiagent Deep Reinforcement Learning in Multi-domain SDWN

Sep 11, 2024
Via arXiv

Language Model Can Listen While Speaking

Aug 05, 2024
Via arXiv