Micah Goldblum

Commercial LLM Agents Are Already Vulnerable to Simple Yet Dangerous Attacks

Feb 12, 2025
Via arXiv

Gemstones: A Model Suite for Multi-Faceted Scaling Laws

Feb 07, 2025
Via arXiv

Transformers Boost the Performance of Decision Trees on Tabular Data across Sample Sizes

Feb 06, 2025
Via arXiv

Refusal Tokens: A Simple Way to Calibrate Refusals in Large Language Models

Dec 09, 2024
Via arXiv

vTune: Verifiable Fine-Tuning for LLMs Through Backdooring

Nov 12, 2024
Via arXiv

A Simple Baseline for Predicting Events with Auto-Regressive Tabular Transformers

Oct 14, 2024
Via arXiv

Searching for Efficient Linear Layers over a Continuous Space of Structured Matrices

Oct 03, 2024
Via arXiv

Unlocking Tokens as Data Points for Generalization Bounds on Larger Language Models

Jul 25, 2024
Via arXiv

LiveBench: A Challenging, Contamination-Free LLM Benchmark

Jun 27, 2024
Via arXiv

Just How Flexible are Neural Networks in Practice?

Jun 17, 2024
Via arXiv