Somak Aditya

ERVQA: A Dataset to Benchmark the Readiness of Large Vision Language Models in Hospital Environments

Oct 08, 2024
Via arXiv

Jailbreak Paradox: The Achilles' Heel of LLMs

Jun 18, 2024
Via arXiv

MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning

Feb 27, 2024
Via arXiv

GRAFFORD: A Benchmark Dataset for Testing the Knowledge of Object Affordances of Language and Vision Models

Feb 20, 2024
Via arXiv

Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs

Jan 18, 2024
Via arXiv

Stuck in the Quicksand of Numeracy, Far from AGI Summit: Evaluating LLMs' Mathematical Competency through Ontology-guided Perturbations

Jan 17, 2024
Via arXiv

Towards LogiGLUE: A Brief Survey and A Benchmark for Analyzing Logical Reasoning Capabilities of Language Models

Oct 02, 2023
Via arXiv

Tricking LLMs into Disobedience: Understanding, Analyzing, and Preventing Jailbreaks

May 24, 2023
Via arXiv

ReMask: A Robust Information-Masking Approach for Domain Counterfactual Generation

May 04, 2023
Via arXiv

Generating Intermediate Steps for NLI with Next-Step Supervision

Aug 31, 2022
Via arXiv