Kun Shao

ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning

Feb 22, 2025

VSC-RL: Advancing Autonomous Vision-Language Agents with Variational Subgoal-Conditioned Reinforcement Learning

Feb 11, 2025

AppVLM: A Lightweight Vision Language Model for Online App Control

Feb 10, 2025

GUI Agents with Foundation Models: A Comprehensive Survey

Nov 07, 2024

Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Nov 05, 2024

Lightweight Neural App Control

Oct 23, 2024

SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation

Oct 19, 2024

DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents

Oct 18, 2024

Learning Precise Affordances from Egocentric Videos for Robotic Manipulation

Aug 19, 2024

ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning

Jun 28, 2024