Martin Vechev

A Unified Approach to Routing and Cascading for LLMs

Oct 14, 2024

COMPL-AI Framework: A Technical Interpretation and LLM Benchmarking Suite for the EU Artificial Intelligence Act

Oct 10, 2024

Average Certified Radius is a Poor Metric for Randomized Smoothing

Oct 09, 2024

Multi-Neuron Unleashes Expressivity of ReLU Networks Under Convex Relaxation

Oct 09, 2024

Ward: Provable RAG Dataset Inference via LLM Watermarks

Oct 04, 2024

Discovering Clues of Spoofed LM Watermarks

Oct 03, 2024

AlphaIntegrator: Transformer Action Search for Symbolic Integration Proofs

Oct 03, 2024

Polyrating: A Cost-Effective and Bias-Aware Rating System for LLM Evaluation

Sep 01, 2024

Practical Attacks against Black-box Code Completion Engines

Aug 05, 2024

Code Agents are State of the Art Software Testers

Jun 18, 2024