Ke Shen

RLAnything: Forge Environment, Policy, and Reward Model in Completely Dynamic RL System

Feb 02, 2026

GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators

Dec 23, 2025

AutoTool: Dynamic Tool Selection and Integration for Agentic Reasoning

Dec 15, 2025

Virtual Width Networks

Nov 17, 2025

Revolutionizing Reinforcement Learning Framework for Diffusion Large Language Models

Sep 08, 2025

ReasonFlux-PRM: Trajectory-Aware PRMs for Long Chain-of-Thought Reasoning in LLMs

Jun 23, 2025

MMaDA: Multimodal Large Diffusion Language Models

May 21, 2025

AttentionInfluence: Adopting Attention Head Influence for Weak-to-Strong Pretraining Data Selection

May 12, 2025

Seed1.5-VL Technical Report

May 11, 2025

Unveiling Downstream Performance Scaling of LLMs: A Clustering-Based Perspective

Feb 24, 2025