Soham Dan

Benchmarking LLM Guardrails in Handling Multilingual Toxicity

Oct 29, 2024
Via arXiv

Large Language Models can be Strong Self-Detoxifiers

Oct 04, 2024
Via arXiv

Multilingual Needle in a Haystack: Investigating Long-Context Behavior of Multilingual Large Language Models

Aug 19, 2024
Via arXiv

Needle in the Haystack for Memory Based Large Language Models

Jul 01, 2024
Via arXiv

AssertionBench: A Benchmark to Evaluate Large-Language Models for Assertion Generation

Jun 26, 2024
Via arXiv

CTBench: A Comprehensive Benchmark for Evaluating Language Model Capabilities in Clinical Trial Design

Jun 25, 2024
Via arXiv

On the Utility of Domain-Adjacent Fine-Tuned Model Ensembles for Few-shot Problems

Jun 19, 2024
Via arXiv

On the Robustness of Language Models for Tabular Question Answering

Jun 18, 2024
Via arXiv

Large Language Model Confidence Estimation via Black-Box Access

Jun 01, 2024
Via arXiv

On the Effects of Fine-tuning Language Models for Text-Based Reinforcement Learning

Apr 15, 2024
Via arXiv