Maosong Sun

A Multi-Power Law for Loss Curve Prediction Across Learning Rate Schedules

Mar 17, 2025
Via arXiv

DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding

Mar 17, 2025
Via arXiv

Towards Self-Improving Systematic Cognition for Next-Generation Foundation MLLMs

Mar 16, 2025
Via arXiv

Cost-Optimal Grouped-Query Attention for Long-Context LLMs

Mar 12, 2025
Via arXiv

Judge as A Judge: Improving the Evaluation of Retrieval-Augmented Generation through the Judge-Consistency of Large Language Models

Feb 26, 2025
Via arXiv

NotaGen: Advancing Musicality in Symbolic Music Generation with Large Language Model Training Paradigms

Feb 26, 2025
Via arXiv

Learning to Generate Structured Output with Schema Reinforcement Learning

Feb 26, 2025
Via arXiv

AgentRM: Enhancing Agent Generalization with Reward Modeling

Feb 25, 2025
Via arXiv

HIPPO: Enhancing the Table Understanding Capability of Large Language Models through Hybrid-Modal Preference Optimization

Feb 24, 2025
Via arXiv

PIP-KAG: Mitigating Knowledge Conflicts in Knowledge-Augmented Generation via Parametric Pruning

Feb 21, 2025
Via arXiv