Koushik Sen

optimize_anything: A Universal API for Optimizing any Text Parameter

May 19, 2026
Via arXiv

Do Androids Dream of Breaking the Game? Systematically Auditing AI Agent Benchmarks with BenchJack

May 12, 2026
Via arXiv

AdaEvolve: Adaptive LLM Driven Zeroth-Order Optimization

Feb 23, 2026
Via arXiv

Let the Barbarians In: How AI Can Accelerate Systems Performance Research

Dec 22, 2025
Via arXiv

GSO: Challenging Software Optimization Tasks for Evaluating SWE-Agents

May 29, 2025
Via arXiv

Type-Constrained Code Generation with Language Models

Apr 12, 2025
Via arXiv

R2E-Gym: Procedural Environments and Hybrid Verifiers for Scaling Open-Weights SWE Agents

Apr 09, 2025
Via arXiv

Challenges and Paths Towards AI for Software Engineering

Mar 28, 2025
Via arXiv

LangProBe: a Language Programs Benchmark

Feb 27, 2025
Via arXiv

Syzygy: Dual Code-Test C to (safe) Rust Translation using LLMs and Dynamic Analysis

Dec 18, 2024
Via arXiv