Mislav Balunović

MathArena: Evaluating LLMs on Uncontaminated Math Competitions

May 29, 2025

Proof or Bluff? Evaluating LLMs on 2025 USA Math Olympiad

Mar 27, 2025

ToolFuzz -- Automated Agent Tool Testing

Mar 06, 2025

COMPL-AI Framework: A Technical Interpretation and LLM Benchmarking Suite for the EU Artificial Intelligence Act

Oct 10, 2024

AgentDojo: A Dynamic Environment to Evaluate Attacks and Defenses for LLM Agents

Jun 19, 2024

Large Language Models are Advanced Anonymizers

Feb 21, 2024

From Principle to Practice: Vertical Data Minimization for Machine Learning

Nov 22, 2023

Beyond Memorization: Violating Privacy Via Inference with Large Language Models

Oct 11, 2023

Programmable Synthetic Tabular Data Generation

Jul 10, 2023

FARE: Provably Fair Representation Learning

Oct 13, 2022