Sameer Singh

Benchmark Data Repositories for Better Benchmarking

Oct 31, 2024
Via arXiv

TurtleBench: A Visual Programming Benchmark in Turtle Geometry

Oct 31, 2024
Via arXiv

Nudging: Inference-time Alignment via Model Collaboration

Oct 15, 2024
Via arXiv

Perceptions of Linguistic Uncertainty by Language Models and Humans

Jul 22, 2024
Via arXiv

Are Models Biased on Text without Gender-related Language?

May 01, 2024
Via arXiv

MisgenderMender: A Community-Informed Approach to Interventions for Misgendering

Apr 23, 2024
Via arXiv

Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills

Feb 05, 2024
Via arXiv

Performance Trade-offs of Watermarking Large Language Models

Nov 16, 2023
Via arXiv

Measuring and Improving Attentiveness to Partial Inputs with Counterfactuals

Nov 16, 2023
Via arXiv

What's In My Big Data?

Oct 31, 2023
Via arXiv