Zhengyin Du

LongReason: A Synthetic Long-Context Reasoning Benchmark via Context Expansion

Jan 25, 2025

Agent-R: Training Language Model Agents to Reflect via Iterative Self-Training

Jan 20, 2025

ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use

Jan 07, 2025

TL-Training: A Task-Feature-Based Framework for Training Large Language Models in Tool Use

Dec 20, 2024