Guang Liu

OmniMoE: An Efficient MoE by Orchestrating Atomic Experts at Scale

Feb 05, 2026
Via arXiv

Towards Automated Kernel Generation in the Era of LLMs

Jan 26, 2026
Via arXiv

CCI4.0: A Bilingual Pretraining Dataset for Enhancing Reasoning in Large Language Models

Jun 09, 2025
Via arXiv

Amplify Adjacent Token Differences: Enhancing Long Chain-of-Thought Reasoning with Shift-FFN

May 22, 2025
Via arXiv

InCo-DPO: Balancing Distribution Shift and Data Quality for Enhanced Preference Optimization

Mar 20, 2025
Via arXiv

Predictable Emergent Abilities of LLMs: Proxy Tasks Are All You Need

Dec 10, 2024
Via arXiv

LLaSA: Large Language and Structured Data Assistant

Nov 16, 2024
Via arXiv

Infinity-MM: Scaling Multimodal Performance with Large-Scale and High-Quality Instruction Data

Oct 24, 2024
Via arXiv

CCI3.0-HQ: a large-scale Chinese dataset of high quality designed for pre-training large language models

Oct 24, 2024
Via arXiv

ReTok: Replacing Tokenizer to Enhance Representation Efficiency in Large Language Model

Oct 06, 2024
Via arXiv