Felix Wick

EnvBridge: Bridging Diverse Environments with Cross-Environment Knowledge Transfer for Embodied AI

Oct 22, 2024

Tomoyuki Kagaya, Yuxuan Lou, Thong Jing Yuan, Subramanian Lakshmi, Jayashree Karlekar, Sugiri Pranata, Natsuki Murakami, Akira Kinose, Koki Oguri, Felix Wick (+1 more)

Abstract: In recent years, Large Language Models (LLMs) have demonstrated high reasoning capabilities, drawing attention for their applications as agents in various decision-making processes. One notably promising application of LLM agents is robotic manipulation. Recent research has shown that LLMs can generate text planning or control code for robots, providing substantial flexibility and interaction capabilities. However, these methods still face challenges in terms of flexibility and applicability across different environments, limiting their ability to adapt autonomously. Current approaches typically fall into two categories: those relying on environment-specific policy training, which restricts their transferability, and those generating code actions based on fixed prompts, which leads to diminished performance when confronted with new environments. These limitations significantly constrain the generalizability of agents in robotic manipulation. To address these limitations, we propose a novel method called EnvBridge. This approach involves the retention and transfer of successful robot control codes from source environments to target environments. EnvBridge enhances the agent's adaptability and performance across diverse settings by leveraging insights from multiple environments. Notably, our approach alleviates environmental constraints, offering a more flexible and generalizable solution for robotic manipulation tasks. We validated the effectiveness of our method using robotic manipulation benchmarks: RLBench, MetaWorld, and CALVIN. Our experiments demonstrate that LLM agents can successfully leverage diverse knowledge sources to solve complex tasks. Consequently, our approach significantly enhances the adaptability and robustness of robotic manipulation agents in planning across diverse environments.
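The idea of retaining successful control code in source environments and reusing it in a target environment can be illustrated with a small sketch. This is not the authors' implementation: the `CodeMemory` class, its `retain` and `transfer` methods, and the word-overlap ranking are all hypothetical choices made for illustration, and the abstract does not say how EnvBridge stores or selects code.

```python
from dataclasses import dataclass, field


@dataclass
class CodeMemory:
    """Hypothetical store of control code that succeeded in some environment."""

    entries: list = field(default_factory=list)

    def retain(self, env: str, task: str, code: str, success: bool) -> None:
        # Keep only attempts that succeeded; failed code is discarded.
        if success:
            self.entries.append({"env": env, "task": task, "code": code})

    def transfer(self, target_env: str, task: str, k: int = 2) -> list:
        # Assumption: rank code from *other* environments by how many
        # words its task description shares with the target task.
        words = set(task.lower().split())
        scored = []
        for entry in self.entries:
            if entry["env"] == target_env:
                continue
            overlap = len(words & set(entry["task"].lower().split()))
            if overlap > 0:
                scored.append((overlap, entry))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [entry for _, entry in scored[:k]]
```

In such a setup, the entries returned by `transfer` would be placed in the prompt of the agent acting in the target environment, so that it can adapt code that already worked elsewhere.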

RAP: Retrieval-Augmented Planning with Contextual Memory for Multimodal LLM Agents

Feb 06, 2024

Tomoyuki Kagaya, Thong Jing Yuan, Yuxuan Lou, Jayashree Karlekar, Sugiri Pranata, Akira Kinose, Koki Oguri, Felix Wick, Yang You

Abstract: Owing to recent advancements, Large Language Models (LLMs) can now be deployed as agents for increasingly complex decision-making applications in areas including robotics, gaming, and API integration. However, reflecting past experiences in current decision-making processes, an innate human behavior, continues to pose significant challenges. To address this, we propose the Retrieval-Augmented Planning (RAP) framework, designed to dynamically leverage past experiences corresponding to the current situation and context, thereby enhancing agents' planning capabilities. RAP distinguishes itself by being versatile: it excels in both text-only and multimodal environments, making it suitable for a wide range of tasks. Empirical evaluations demonstrate RAP's effectiveness: it achieves SOTA performance in textual scenarios and notably enhances multimodal LLM agents' performance on embodied tasks. These results highlight RAP's potential in advancing the functionality and applicability of LLM agents in complex, real-world applications.
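Retrieving past experiences that match the current situation and feeding them to a planner can be sketched as follows. This is a minimal illustration under stated assumptions, not RAP itself: the bag-of-words cosine similarity, the `retrieve` and `build_prompt` functions, and the `situation`/`plan` record fields are all hypothetical, and the abstract does not specify how RAP represents or scores experiences.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(memory: list, situation: str, k: int = 1) -> list:
    """Return the k stored experiences most similar to the current situation."""
    query = Counter(situation.lower().split())
    return sorted(
        memory,
        key=lambda exp: cosine(query, Counter(exp["situation"].lower().split())),
        reverse=True,
    )[:k]


def build_prompt(memory: list, situation: str, k: int = 1) -> str:
    """Prepend retrieved experiences to the current situation (hypothetical format)."""
    lines = [
        f"Past situation: {exp['situation']}\nPast plan: {exp['plan']}"
        for exp in retrieve(memory, situation, k)
    ]
    lines.append(f"Current situation: {situation}\nPlan:")
    return "\n\n".join(lines)
```

A multimodal variant would need a similarity measure over image observations as well as text; that part is left out here because the abstract gives no detail on it.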

Cyclic Boosting -- an explainable supervised machine learning algorithm

Feb 09, 2020

Felix Wick, Ulrich Kerzel, Michael Feindt

Abstract: Supervised machine learning algorithms have seen spectacular advances and surpassed human-level performance in a wide range of specific applications. However, using complex ensemble or deep learning algorithms typically results in black-box models, where the path leading to individual predictions cannot be followed in detail. To address this issue, we propose the novel "Cyclic Boosting" machine learning algorithm, which efficiently performs accurate regression and classification tasks while at the same time allowing a detailed understanding of how each individual prediction was made.

* Accepted at ICMLA 2019
