Sicheng Zhao

University of Michigan

Equivariant Neural Networks for General Linear Symmetries on Lie Algebras

Oct 27, 2025
Via arXiv

Customizing Visual Emotion Evaluation for MLLMs: An Open-vocabulary, Multifaceted, and Scalable Approach

Sep 26, 2025
Via arXiv

S$^4$C: Speculative Sampling with Syntactic and Semantic Coherence for Efficient Inference of Large Language Models

Jun 17, 2025
Via arXiv

DiscoVLA: Discrepancy Reduction in Vision, Language, and Alignment for Parameter-Efficient Video-Text Retrieval

Jun 10, 2025
Via arXiv

AdaTP: Attention-Debiased Token Pruning for Video Large Language Models

May 26, 2025
Via arXiv

An Empirical Study on Configuring In-Context Learning Demonstrations for Unleashing MLLMs' Sentimental Perception Capability

May 22, 2025
Via arXiv

Modality Reliability Guided Multimodal Recommendation

Apr 23, 2025
Via arXiv

LLaVA-MLB: Mitigating and Leveraging Attention Bias for Training-Free Video LLMs

Mar 14, 2025
Via arXiv

FastVID: Dynamic Density Pruning for Fast Video Large Language Models

Mar 14, 2025
Via arXiv

Bridge then Begin Anew: Generating Target-relevant Intermediate Model for Source-free Visual Emotion Adaptation

Dec 18, 2024
Via arXiv