Tushar Khot

Latent Factor Models Meets Instructions: Goal-conditioned Latent Factor Discovery without Task Supervision

Feb 21, 2025
Via arXiv

SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories

Sep 11, 2024
Via arXiv

AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents

Jul 26, 2024
Via arXiv

DiscoveryBench: Towards Data-Driven Discovery with Large Language Models

Jul 01, 2024
Via arXiv

Husky: A Unified, Open-Source Language Agent for Multi-Step Reasoning

Jun 10, 2024
Via arXiv

DISCOVERYWORLD: A Virtual Environment for Developing and Evaluating Automated Scientific Discovery Agents

Jun 10, 2024
Via arXiv

OLMo: Accelerating the Science of Language Models

Feb 07, 2024
Via arXiv

Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs

Nov 08, 2023
Via arXiv

ADaPT: As-Needed Decomposition and Planning with Language Models

Nov 08, 2023
Via arXiv

How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources

Jun 07, 2023
Via arXiv