Zhangyue Yin

Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities?

Feb 17, 2025

Error Classification of Large Language Models on Math Word Problems: A Dynamically Adaptive Framework

Jan 26, 2025

VLABench: A Large-Scale Benchmark for Language-Conditioned Robotics Manipulation with Long-Horizon Reasoning Tasks

Dec 24, 2024

Scaling of Search and Learning: A Roadmap to Reproduce o1 from Reinforcement Learning Perspective

Dec 18, 2024

Unified Active Retrieval for Retrieval Augmented Generation

Jun 18, 2024

Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models

May 21, 2024

A Survey of Neural Code Intelligence: Paradigms, Advances and Beyond

Mar 21, 2024

Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem

Mar 06, 2024

Turn Waste into Worth: Rectifying Top-$k$ Router of MoE

Feb 21, 2024

Can AI Assistants Know What They Don't Know?

Jan 28, 2024