
Kai Chen

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Sep 26, 2025

ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data

Sep 18, 2025

CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning

Aug 27, 2025

Building Self-Evolving Agents via Experience-Driven Lifelong Learning: A Framework and Benchmark

Aug 26, 2025

InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency

Aug 25, 2025

InternBootcamp Technical Report: Boosting LLM Reasoning with Verifiable Task Scaling

Aug 12, 2025

Undress to Redress: A Training-Free Framework for Virtual Try-On

Aug 11, 2025

CharacterShot: Controllable and Consistent 4D Character Animation

Aug 10, 2025

IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards

Aug 06, 2025

LLM-Crowdsourced: A Benchmark-Free Paradigm for Mutual Evaluation of Large Language Models

Jul 30, 2025