Xiang Yao

CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents

Jul 01, 2024

Tianqi Xu, Linyao Chen, Dai-Jie Wu, Yanjun Chen, Zecheng Zhang, Xiang Yao, Zhiqiang Xie, Yongchao Chen, Shilong Liu, Bochen Qian(+3 more)

Abstract: The development of autonomous agents increasingly relies on Multimodal Language Models (MLMs) to perform tasks described in natural language with GUI environments, such as websites, desktop computers, or mobile phones. Existing benchmarks for MLM agents in interactive environments are limited by their focus on a single environment, lack of detailed and generalized evaluation methods, and the complexities of constructing tasks and evaluators. To overcome these limitations, we introduce Crab, the first agent benchmark framework designed to support cross-environment tasks, incorporating a graph-based fine-grained evaluation method and an efficient mechanism for task and evaluator construction. Our framework supports multiple devices and can be easily extended to any environment with a Python interface. Leveraging Crab, we developed a cross-platform Crab Benchmark-v0 comprising 100 tasks in computer desktop and mobile phone environments. We evaluated four advanced MLMs using different single and multi-agent system configurations on this benchmark. The experimental results demonstrate that the single agent with GPT-4o achieves the best completion ratio of 35.26%. All framework code, agent code, and task datasets are publicly available at https://github.com/camel-ai/crab.
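The abstract names a graph-based evaluation method and a completion ratio but does not define either. As a rough illustration only, the sketch below assumes a task is split into checkpoints arranged as a dependency graph and scores the fraction of checkpoints reached. The `Checkpoint` class, the `completion_ratio` function, and the example state keys are all hypothetical and are not taken from the Crab codebase.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Checkpoint:
    """One node of a hypothetical task graph (not Crab's actual API)."""
    name: str
    check: Callable[[Dict[str, bool]], bool]  # inspects the environment state
    requires: List[str] = field(default_factory=list)  # prerequisite node names


def completion_ratio(checkpoints: List[Checkpoint], state: Dict[str, bool]) -> float:
    """Fraction of checkpoints reached; assumes the list is topologically ordered."""
    done = set()
    for cp in checkpoints:
        # A node counts only if all its prerequisites were reached first.
        if all(r in done for r in cp.requires) and cp.check(state):
            done.add(cp.name)
    return len(done) / len(checkpoints)


# Hypothetical cross-environment task: download a file on desktop, send it from a phone.
task = [
    Checkpoint("browser_open", lambda s: s["browser_open"]),
    Checkpoint("file_downloaded", lambda s: s["file_downloaded"], ["browser_open"]),
    Checkpoint("file_sent", lambda s: s["file_sent"], ["file_downloaded"]),
]
state = {"browser_open": True, "file_downloaded": True, "file_sent": False}
ratio = completion_ratio(task, state)
```

Under these assumptions, an agent that stops partway still receives partial credit, which is the kind of fine-grained signal a pass/fail check cannot give.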

A Two-Dimensional Deep Network for RF-based Drone Detection and Identification Towards Secure Coverage Extension

Aug 26, 2023

Zixiao Zhao, Qinghe Du, Xiang Yao, Lei Lu, Shijiao Zhang

Abstract: As drones become increasingly prevalent in human life, they also raise security concerns such as unauthorized access and control, as well as collisions and interference with manned aircraft. Therefore, the ability to accurately detect and distinguish between different drones holds significant implications for coverage extension. Assisted by machine learning, radio frequency (RF) detection can recognize the type and flight mode of drones based on the sampled drone signals. In this paper, we first utilize the Short-Time Fourier Transform (STFT) to extract two-dimensional features from the raw signals, which contain both time-domain and frequency-domain information. Then, we employ a Convolutional Neural Network (CNN) built with a ResNet structure to achieve multi-class classification. Our experimental results show that the proposed ResNet-STFT can achieve higher accuracy and faster convergence on the extended dataset. Additionally, it exhibits balanced performance compared to other baselines on the raw dataset.
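The STFT step described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the authors' pipeline: the frame length, hop size, Hann window, and the synthetic 1 kHz test tone are all assumptions chosen for the example, and the resulting 2-D array is only the kind of input a CNN could consume.

```python
import numpy as np


def stft_magnitude(signal: np.ndarray, frame_len: int = 256, hop: int = 128) -> np.ndarray:
    """Return a magnitude spectrogram of shape (frequency bins, time frames)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # Real FFT of each frame; transpose so rows are frequency bins.
    return np.abs(np.fft.rfft(frames, axis=1)).T


# Synthetic stand-in for a sampled RF signal: one second of a 1 kHz tone at 8 kHz.
fs = 8000
t = np.arange(fs) / fs
x = np.sin(2 * np.pi * 1000 * t)

spec = stft_magnitude(x)
# Strongest frequency bin, averaged over time, converted back to Hz.
peak_hz = spec.mean(axis=1).argmax() * fs / 256
```

Each column of `spec` is the spectrum of one short time slice, so the array carries both time-domain and frequency-domain structure in a single image-like input.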
