Wenliang Chen

OneEval: Benchmarking LLM Knowledge-intensive Reasoning over Diverse Knowledge Bases

Jun 14, 2025
Via arXiv

CheMatAgent: Enhancing LLMs for Chemistry and Materials Science through Tree-Search Based Tool Learning

Jun 12, 2025
Via arXiv

Is Fine-Tuning an Effective Solution? Reassessing Knowledge Editing for Unstructured Data

Jun 11, 2025
Via arXiv

SELT: Self-Evaluation Tree Search for LLMs with Task Decomposition

Jun 09, 2025
Via arXiv

ChemAgent: Enhancing LLMs for Chemistry and Materials Science through Tree-Search Based Tool Learning

Jun 09, 2025
Via arXiv

UAQFact: Evaluating Factual Knowledge Utilization of LLMs on Unanswerable Questions

May 29, 2025
Via arXiv

Chain-of-Tools: Utilizing Massive Unseen Tools in the CoT Reasoning of Frozen Language Models

Mar 21, 2025
Via arXiv

Iterative Value Function Optimization for Guided Decoding

Mar 05, 2025
Via arXiv

NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models

Oct 15, 2024
Via arXiv

Learning to Refuse: Towards Mitigating Privacy Risks in LLMs

Jul 14, 2024
Via arXiv