Picture for Xu Sun

Xu Sun

Hyperspectral Image Cross-Domain Object Detection Method based on Spectral-Spatial Feature Alignment

Add code
Nov 25, 2024
Viaarxiv icon

Unveiling Molecular Secrets: An LLM-Augmented Linear Model for Explainable and Calibratable Molecular Property Prediction

Add code
Oct 11, 2024
Viaarxiv icon

Temporal Reasoning Transfer from Text to Video

Add code
Oct 08, 2024
Figure 1 for Temporal Reasoning Transfer from Text to Video
Figure 2 for Temporal Reasoning Transfer from Text to Video
Figure 3 for Temporal Reasoning Transfer from Text to Video
Figure 4 for Temporal Reasoning Transfer from Text to Video
Viaarxiv icon

Enhancing Data Quality through Self-learning on Imbalanced Financial Risk Data

Add code
Sep 15, 2024
Viaarxiv icon

Auxiliary-Loss-Free Load Balancing Strategy for Mixture-of-Experts

Add code
Aug 28, 2024
Viaarxiv icon

DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models

Add code
May 31, 2024
Figure 1 for DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models
Figure 2 for DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models
Figure 3 for DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models
Figure 4 for DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models
Viaarxiv icon

InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation

Add code
May 24, 2024
Figure 1 for InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation
Figure 2 for InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation
Figure 3 for InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation
Figure 4 for InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation
Viaarxiv icon

LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation?

Add code
Apr 16, 2024
Figure 1 for LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation?
Figure 2 for LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation?
Figure 3 for LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation?
Figure 4 for LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation?
Viaarxiv icon

Towards Multimodal Video Paragraph Captioning Models Robust to Missing Modality

Add code
Mar 28, 2024
Viaarxiv icon

TempCompass: Do Video LLMs Really Understand Videos?

Add code
Mar 01, 2024
Figure 1 for TempCompass: Do Video LLMs Really Understand Videos?
Figure 2 for TempCompass: Do Video LLMs Really Understand Videos?
Figure 3 for TempCompass: Do Video LLMs Really Understand Videos?
Figure 4 for TempCompass: Do Video LLMs Really Understand Videos?
Viaarxiv icon