Xiaojiang Peng

Why We Feel: Breaking Boundaries in Emotional Reasoning with Multimodal Large Language Models

Apr 10, 2025

Dynamic Vision Mamba

Apr 07, 2025

MPBench: A Comprehensive Multimodal Reasoning Benchmark for Process Errors Identification

Mar 16, 2025

PEBench: A Fictitious Dataset to Benchmark Machine Unlearning for Multimodal Large Language Models

Mar 16, 2025

Faster Vision Mamba is Rebuilt in Minutes via Merged Token Re-training

Dec 17, 2024

Rethinking Structure Learning For Graph Neural Networks

Nov 12, 2024

Is Graph Convolution Always Beneficial For Every Feature?

Nov 12, 2024

Mamba-Enhanced Text-Audio-Video Alignment Network for Emotion Recognition in Conversations

Sep 08, 2024

DPDEdit: Detail-Preserved Diffusion Models for Multimodal Fashion Image Editing

Sep 02, 2024

Multimodal Multi-turn Conversation Stance Detection: A Challenge Dataset and Effective Model

Sep 01, 2024