Mengshu Sun

Sparse-RL: Breaking the Memory Wall in LLM Reinforcement Learning via Stable Sparse Rollouts

Jan 15, 2026
Via arXiv

Self-Correction Distillation for Structured Data Question Answering

Nov 17, 2025
Via arXiv

Thinker: Training LLMs in Hierarchical Thinking for Deep Search via Multi-Turn Interaction

Nov 14, 2025
Via arXiv

SKA-Bench: A Fine-Grained Benchmark for Evaluating Structured Knowledge Understanding of LLMs

Jul 23, 2025
Via arXiv

SciCUEval: A Comprehensive Dataset for Evaluating Scientific Context Understanding in Large Language Models

May 21, 2025
Via arXiv

LookAhead Tuning: Safer Language Models via Partial Answer Previews

Mar 24, 2025
Via arXiv

Exploring Typographic Visual Prompts Injection Threats in Cross-Modality Generation Models

Mar 14, 2025
Via arXiv

Bi'an: A Bilingual Benchmark and Model for Hallucination Detection in Retrieval-Augmented Generation

Feb 26, 2025
Via arXiv

LightThinker: Thinking Step-by-Step Compression

Feb 21, 2025
Via arXiv

K-ON: Stacking Knowledge On the Head Layer of Large Language Model

Feb 10, 2025
Via arXiv