Xiangru Tang

ChemAgent: Self-updating Library in Large Language Models Improves Chemical Reasoning

Jan 11, 2025

ChemSafetyBench: Benchmarking LLM Safety on Chemistry Domain

Nov 23, 2024

FinDVer: Explainable Claim Verification over Long and Hybrid-Content Financial Documents

Nov 08, 2024

OpenDevin: An Open Platform for AI Software Developers as Generalist Agents

Jul 23, 2024

Step-Back Profiling: Distilling User History for Personalized Scientific Writing

Jun 20, 2024

Unveiling the Spectrum of Data Contamination in Language Models: A Survey from Detection to Remediation

Jun 20, 2024

PRESTO: Progressive Pretraining Enhances Synthetic Chemistry Outcomes

Jun 19, 2024

Lessons from the Trenches on Reproducible Evaluation of Language Models

May 23, 2024

MIMIR: A Streamlined Platform for Personalized Agent Tuning in Domain Expertise

Apr 03, 2024

Data Interpreter: An LLM Agent For Data Science

Mar 12, 2024