Lichao Sun

Lehigh University

From Correctness to Comprehension: AI Agents for Personalized Error Diagnosis in Education

Feb 19, 2025

XAttnMark: Learning Robust Audio Watermarking with Cross-Attention

Feb 07, 2025

CultureVLM: Characterizing and Improving Cultural Understanding of Vision-Language Models for over 100 Countries

Jan 02, 2025

Political-LLM: Large Language Models in Political Science

Dec 09, 2024

LLaVA-CoT: Let Vision Language Models Reason Step-by-Step

Nov 25, 2024

Thinking Before Looking: Improving Multimodal LLM Reasoning via Mitigating Visual Hallucination

Nov 15, 2024

LLaVA-o1: Let Vision Language Models Reason Step-by-Step

Nov 15, 2024

SpecHub: Provable Acceleration to Multi-Draft Speculative Decoding

Nov 08, 2024

Both Text and Images Leaked! A Systematic Analysis of Multimodal LLM Data Contamination

Nov 06, 2024

Benchmarking Vision Language Model Unlearning via Fictitious Facial Identity Dataset

Nov 05, 2024