Ridhi Jain

Dynamic Intelligence Assessment: Benchmarking LLMs on the Road to AGI with a Focus on Model Confidence

Oct 20, 2024
Via arXiv

Do Neutral Prompts Produce Insecure Code? FormAI-v2 Dataset: Labelling Vulnerabilities in Code Generated by Large Language Models

Apr 29, 2024
Via arXiv

CyberMetric: A Benchmark Dataset for Evaluating Large Language Models Knowledge in Cybersecurity

Feb 12, 2024
Via arXiv

The FormAI Dataset: Generative AI in Software Security Through the Lens of Formal Verification

Jul 05, 2023
Via arXiv

A New Era in Software Security: Towards Self-Healing Software via Large Language Models and Formal Verification

May 24, 2023
Via arXiv