Xiaoyang Chen

NOVA: A Verification-Aware Agent Harness for Architecture Evolution in Industrial Recommender Systems

Jun 25, 2026
Via arXiv

Towards Verifiable Agentic Data Science: Solving Irregular TSQA Via Tool-Grounded Reasoning

Jun 13, 2026
Via arXiv

Moonshine: An Autonomous Mathematical Research Agent Centered on Conjecture Generation

Jun 09, 2026
Via arXiv

Revisiting Articulated Parts Perception in Robot Manipulation

Jun 06, 2026
Via arXiv

Lectures on AI for Mathematics

Apr 13, 2026
Via arXiv

Can LLM generate interesting mathematical research problems?

Mar 19, 2026
Via arXiv

Rank4Gen: RAG-Preference-Aligned Document Set Selection and Ranking

Jan 16, 2026
Via arXiv

Accelerating Multi-modal LLM Gaming Performance via Input Prediction and Mishit Correction

Dec 19, 2025
Via arXiv

Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report

Jul 22, 2025
Via arXiv

DeepMath-Creative: A Benchmark for Evaluating Mathematical Creativity of Large Language Models

May 13, 2025
Via arXiv