Title: Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets

Jun 18, 2021

Irene Solaiman, Christy Dennison

Abstract: Language models can generate harmful and biased outputs and exhibit undesirable behavior. We propose a Process for Adapting Language Models to Society (PALMS) with Values-Targeted Datasets, an iterative process to significantly change model behavior by crafting and fine-tuning on a dataset that reflects a predetermined set of target values. We evaluate our process using three metrics: two quantitative metrics (human evaluations that score output adherence to a target value, and toxicity scoring on outputs) and one qualitative metric that analyzes the most common word associated with a given social category. In each iteration, we add training examples based on shortcomings observed in evaluations. PALMS performs significantly better on all metrics compared to baseline and control models for a broad range of GPT-3 language model sizes without compromising capability integrity. We find that the effectiveness of PALMS increases with model size. We show that significantly adjusting language model behavior is feasible with a small, hand-curated dataset.
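The iterative process described in the abstract (fine-tune on a values-targeted dataset, evaluate, then add examples that address observed shortcomings) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `palms_style_loop`, its parameters, and the stopping rule are assumptions made for illustration, and the fine-tuning, evaluation, and example-writing steps are passed in as placeholder callables because the abstract does not specify them.

```python
from typing import Callable, List, Sequence, Tuple


def palms_style_loop(
    dataset: List[str],
    fine_tune: Callable[[Sequence[str]], object],
    evaluate: Callable[[object], List[str]],
    write_examples: Callable[[List[str]], List[str]],
    max_iterations: int = 3,
) -> Tuple[object, List[str]]:
    """Hypothetical sketch of an iterate-and-extend fine-tuning loop.

    fine_tune:      returns a model trained on the current dataset.
    evaluate:       returns a list of observed shortcomings (empty if none).
    write_examples: returns new hand-written examples for those shortcomings.
    """
    dataset = list(dataset)  # copy so the caller's list is not mutated
    model = None
    for _ in range(max_iterations):
        model = fine_tune(dataset)
        shortcomings = evaluate(model)
        if not shortcomings:
            # Assumed stopping rule: stop once evaluation finds nothing to fix.
            break
        dataset.extend(write_examples(shortcomings))
    return model, dataset
```

In practice the three callables would wrap a fine-tuning job, the human and automated evaluations, and manual dataset authoring; here they are left abstract so the control flow is the only thing the sketch commits to.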

* Both authors contributed equally. Submitted to NeurIPS 2021

View paper on

OpenReview
