Orna Raz

Statistical multi-metric evaluation and visualization of LLM system predictive performance

Jan 30, 2025

Can You Trust Your Metric? Automatic Concatenation-Based Tests for Metric Validity

Aug 22, 2024

Generating Unseen Code Tests In Infinitum

Jul 29, 2024

Using Combinatorial Optimization to Design a High quality LLM Solution

May 15, 2024

Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations

Mar 09, 2024

Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations

Mar 08, 2024

Unveiling Safety Vulnerabilities of Large Language Models

Nov 07, 2023

Predicting Question-Answering Performance of Large Language Models through Semantic Consistency

Nov 02, 2023

Automatic Generation of Attention Rules For Containment of Machine Learning Model Errors

May 14, 2023

Measuring the Measuring Tools: An Automatic Evaluation of Semantic Metrics for Text Corpora

Nov 29, 2022