Avi Schwarzschild

Easy2Hard-Bench: Standardized Difficulty Labels for Profiling LLM Performance and Generalization

Sep 27, 2024
Via arXiv

Prompt Recovery for Image Generation Models: A Comparative Study of Discrete Optimizers

Aug 12, 2024
Via arXiv

The CLRS-Text Algorithmic Reasoning Language Benchmark

Jun 06, 2024
Via arXiv

Transformers Can Do Arithmetic with the Right Embeddings

May 27, 2024
Via arXiv

Rethinking LLM Memorization through the Lens of Adversarial Compression

Apr 23, 2024
Via arXiv

Forcing Diffuse Distributions out of Language Models

Apr 16, 2024
Via arXiv

Benchmarking ChatGPT on Algorithmic Reasoning

Apr 04, 2024
Via arXiv

Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text

Jan 22, 2024
Via arXiv

TOFU: A Task of Fictitious Unlearning for LLMs

Jan 11, 2024
Via arXiv

Effective Backdoor Mitigation Depends on the Pre-training Objective

Dec 05, 2023
Via arXiv