Dongzhan Zhou

SciEvalKit: An Open-source Evaluation Toolkit for Scientific General Intelligence

Dec 30, 2025

SCP: Accelerating Discovery with a Global Web of Autonomous Scientific Agents

Dec 30, 2025

MiST: Understanding the Role of Mid-Stage Scientific Training in Developing Chemical Reasoning Models

Dec 24, 2025

An Agentic Framework for Autonomous Materials Computation

Dec 22, 2025

Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows

Dec 18, 2025

Single-Agent Scaling Fails Multi-Agent Intelligence: Towards Foundation Models with Native Multi-Agent Intelligence

Dec 16, 2025

Reasoning via Video: The First Evaluation of Video Models' Reasoning Abilities through Maze-Solving Tasks

Nov 19, 2025

P1: Mastering Physics Olympiads with Reinforcement Learning

Nov 17, 2025

AtomWorld: A Benchmark for Evaluating Spatial Reasoning in Large Language Models on Crystalline Materials

Oct 06, 2025

ChemBOMAS: Accelerated BO in Chemistry with LLM-Enhanced Multi-Agent System

Sep 10, 2025