Tian Liang

Dancing with Critiques: Enhancing LLM Reasoning with Stepwise Natural Language Self-Critique

Mar 21, 2025
Via arXiv

RaSA: Rank-Sharing Low-Rank Adaptation

Mar 16, 2025
Via arXiv

The First Few Tokens Are All You Need: An Efficient and Effective Unsupervised Prefix Fine-Tuning Method for Reasoning Models

Mar 04, 2025
Via arXiv

Thoughts Are All Over the Place: On the Underthinking of o1-Like LLMs

Jan 30, 2025
Via arXiv

Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs

Dec 30, 2024
Via arXiv

Teaching LLMs to Refine with Tools

Dec 22, 2024
Via arXiv

Critical Tokens Matter: Token-Level Contrastive Estimation Enhances LLM's Reasoning Capability

Dec 02, 2024
Via arXiv

Draft Model Knows When to Stop: A Self-Verification Length Policy for Speculative Decoding

Nov 27, 2024
Via arXiv

Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training

Jul 12, 2024
Via arXiv

How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments

Mar 18, 2024
Via arXiv