Palash Goyal

Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models

Oct 07, 2024

Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification

Oct 07, 2024

Are you talking to ['xem'] or ['x', 'em']? On Tokenization and Addressing Misgendering in LLMs with Pronoun Tokenization Parity

Dec 21, 2023

Faithful Model Evaluation for Model-Based Metrics

Dec 19, 2023

JAB: Joint Adversarial Prompting and Belief Augmentation

Nov 16, 2023

On the steerability of large language models toward data-driven personas

Nov 08, 2023

FLIRT: Feedback Loop In-context Red Teaming

Aug 08, 2023

"I'm fully who I am": Towards Centering Transgender and Non-Binary Voices to Measure Biases in Open Language Generation

May 18, 2023

Is the Elephant Flying? Resolving Ambiguities in Text-to-Image Generative Models

Nov 17, 2022

Leveraging Local Temporal Information for Multimodal Scene Classification

Oct 26, 2021