Alexandre Lacoste

Google

The BrowserGym Ecosystem for Web Agent Research

Dec 10, 2024

GEOBench-VLM: Benchmarking Vision-Language Models for Geospatial Tasks

Nov 28, 2024

Context is Key: A Benchmark for Forecasting with Essential Textual Information

Oct 24, 2024

InsightBench: Evaluating Business Analytics Agents Through Multi-Step Insight Generation

Jul 08, 2024

WorkArena++: Towards Compositional Planning and Reasoning-based Common Knowledge Work Tasks

Jul 07, 2024

WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks?

Mar 12, 2024

Nonparametric Partial Disentanglement via Mechanism Sparsity: Sparse Actions, Interventions and Sparse Temporal Dependencies

Jan 10, 2024

Capture the Flag: Uncovering Data Insights with Large Language Models

Dec 21, 2023

GEO-Bench: Toward Foundation Models for Earth Monitoring

Jun 06, 2023

Choreographer: Learning and Adapting Skills in Imagination

Nov 23, 2022