Xingwu Sun

Scaling Laws for Floating Point Quantization Training

Jan 05, 2025

Enhancing Contrastive Learning Inspired by the Philosophy of "The Blind Men and the Elephant"

Dec 21, 2024

DHCP: Detecting Hallucinations by Cross-modal Attention Pattern in Large Vision-Language Models

Nov 27, 2024

Mitigating Hallucination in Multimodal Large Language Model via Hallucination-targeted Direct Preference Optimization

Nov 15, 2024

More Expressive Attention with Negative Weights

Nov 14, 2024

Hunyuan-Large: An Open-Source MoE Model with 52 Billion Activated Parameters by Tencent

Nov 05, 2024

Continuous Speech Tokenizer in Text To Speech

Oct 22, 2024

Exploring Forgetting in Large Language Model Pre-Training

Oct 22, 2024

Lossless KV Cache Compression to 2%

Oct 20, 2024

RosePO: Aligning LLM-based Recommenders with Human Values

Oct 16, 2024