Xiangru Tang

EvoClaw: Evaluating AI Agents on Continuous Software Evolution

Mar 13, 2026
Via arXiv

LatentChem: From Textual CoT to Latent Thinking in Chemical Reasoning

Feb 06, 2026
Via arXiv

Why Reasoning Fails to Plan: A Planning-Centric Analysis of Long-Horizon Decision Making in LLM Agents

Jan 29, 2026
Via arXiv

Molecular Representations in Implicit Functional Space via Hyper-Networks

Jan 29, 2026
Via arXiv

Agentic Reasoning for Large Language Models

Jan 18, 2026
Via arXiv

Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows

Dec 18, 2025
Via arXiv

Agent KB: Leveraging Cross-Domain Experience for Agentic Problem Solving

Jul 08, 2025
Via arXiv

SciArena: An Open Evaluation Platform for Foundation Models in Scientific Literature Tasks

Jul 01, 2025
Via arXiv

Scaling Test-time Compute for LLM Agents

Jun 15, 2025
Via arXiv

Med-PRM: Medical Reasoning Models with Stepwise, Guideline-verified Process Rewards

Jun 13, 2025
Via arXiv