Chirag Agarwal

On the Impact of Fine-Tuning on Chain-of-Thought Reasoning
Nov 22, 2024

Towards Operationalizing Right to Data Protection
Nov 16, 2024

On the Hardness of Faithful Chain-of-Thought Reasoning in Large Language Models
Jun 15, 2024

Towards Safe and Aligned Large Language Models for Medicine
Mar 06, 2024

Understanding the Effects of Iterative Prompting on Truthfulness
Feb 09, 2024

Faithfulness vs. Plausibility: On the (Un)Reliability of Explanations from Large Language Models
Feb 08, 2024

Quantifying Uncertainty in Natural Language Explanations of Large Language Models
Nov 06, 2023

Are Large Language Models Post Hoc Explainers?
Oct 10, 2023

On the Trade-offs between Adversarial Robustness and Actionable Explanations
Sep 28, 2023

Certifying LLM Safety against Adversarial Prompting
Sep 06, 2023