Wang Lin

Iris: Breaking GUI Complexity with Adaptive Focus and Self-Refining

Dec 13, 2024
Via arXiv

Bridging the Gap for Test-Time Multimodal Sentiment Analysis

Dec 10, 2024
Via arXiv

Semantic Alignment for Multimodal Large Language Models

Aug 23, 2024
Via arXiv

Instruction Tuning-free Visual Token Complement for Multimodal LLMs

Aug 09, 2024
Via arXiv

EAGER: Two-Stream Generative Recommender with Behavior-Semantic Collaboration

Jun 20, 2024
Via arXiv

Non-confusing Generation of Customized Concepts in Diffusion Models

May 11, 2024
Via arXiv

OpenSR: Open-Modality Speech Recognition via Maintaining Multi-Modality Alignment

Jun 10, 2023
Via arXiv

MixSpeech: Cross-Modality Self-Learning with Audio-Visual Stream Mixup for Visual Speech Translation and Recognition

Mar 09, 2023
Via arXiv