Kun Shao

GUI Agents with Foundation Models: A Comprehensive Survey

Nov 07, 2024

Large Language Models Orchestrating Structured Reasoning Achieve Kaggle Grandmaster Level

Nov 05, 2024

Lightweight Neural App Control

Oct 23, 2024

SPA-Bench: A Comprehensive Benchmark for SmartPhone Agent Evaluation

Oct 19, 2024

DistRL: An Asynchronous Distributed Reinforcement Learning Framework for On-Device Control Agents

Oct 18, 2024

Learning Precise Affordances from Egocentric Videos for Robotic Manipulation

Aug 19, 2024

ROS-LLM: A ROS framework for embodied AI with task feedback and structured reasoning

Jun 28, 2024

SFANet: Spatial-Frequency Attention Network for Weather Forecasting

May 29, 2024

Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control

Feb 09, 2024

Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning

Dec 22, 2023