Ofir Press

SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?

Oct 04, 2024
Via arXiv

EnIGMA: Enhanced Interactive Generative Model Agent for CTF Challenges

Sep 24, 2024
Via arXiv

AssistantBench: Can Web Agents Solve Realistic and Time-Consuming Tasks?

Jul 22, 2024
Via arXiv

SciCode: A Research Coding Benchmark Curated by Scientists

Jul 18, 2024
Via arXiv

SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

Oct 10, 2023
Via arXiv

How Language Model Hallucinations Can Snowball

May 22, 2023
Via arXiv

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Nov 09, 2022
Via arXiv

What Language Model to Train if You Have One Million GPU Hours?

Nov 08, 2022
Via arXiv

Measuring and Narrowing the Compositionality Gap in Language Models

Oct 07, 2022
Via arXiv

Transformer Language Models without Positional Encodings Still Learn Positional Information

Mar 30, 2022
Via arXiv