Title: PromptClass: Weakly-Supervised Text Classification with Prompting Enhanced Noise-Robust Self-Training

May 23, 2023

Yunyi Zhang, Minhao Jiang, Yu Meng, Yu Zhang, Jiawei Han

Abstract: Recently proposed weakly-supervised text classification settings train a classifier using the label name of each target class as the only supervision. Such weakly-supervised settings have been gaining increasing attention since they can largely reduce human annotation efforts compared to fully-supervised and semi-supervised settings. Most existing methods follow a strategy that first uses the label names as static features to generate pseudo labels, which are then used for classifier training. While reasonable, this commonly adopted framework suffers from two limitations: (1) words can have different meanings in different contexts, so using label names for context-free matching can induce very noisy pseudo labels; and (2) the errors made in the pseudo label generation stage propagate directly to the classifier training stage without a chance of being corrected. In this paper, we propose a new method, PromptClass, consisting of two modules: (1) a pseudo label acquisition module that uses zero-shot prompting of pre-trained language models (PLMs) to get pseudo labels based on contextualized text understanding, and (2) a noise-robust self-training module that iteratively trains the classifier and updates pseudo labels by utilizing two PLM fine-tuning strategies that regularize each other. Extensive experiments show that PromptClass achieves overall better performance than existing strong baselines on four benchmark datasets and even achieves similar performance to fully-supervised classifiers on sentiment classification tasks.
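
To make the idea of pseudo labels checked by two models more concrete, here is a minimal Python sketch. It is not the PromptClass implementation: the abstract does not say how the two fine-tuning strategies regularize each other, so the agreement-plus-confidence filter below is only one plausible reading. The names `select_pseudo_labels` and `make_toy_scorer` and the `threshold` parameter are hypothetical, and the keyword-based scorers are toy stand-ins for real PLM-based classifiers.

```python
from typing import Callable, Dict, Iterable, Sequence

Probs = Dict[str, float]          # class label -> probability
Scorer = Callable[[str], Probs]   # document text -> class probabilities


def select_pseudo_labels(
    docs: Sequence[str],
    scorer_a: Scorer,
    scorer_b: Scorer,
    threshold: float = 0.7,
) -> Dict[int, str]:
    """Keep a document only if both scorers predict the same label and
    their averaged confidence in that label reaches the threshold.

    Assumption: agreement filtering is an illustrative stand-in for the
    mutual regularization mentioned in the abstract.
    """
    selected: Dict[int, str] = {}
    for i, doc in enumerate(docs):
        probs_a, probs_b = scorer_a(doc), scorer_b(doc)
        label_a = max(probs_a, key=probs_a.get)
        label_b = max(probs_b, key=probs_b.get)
        confidence = (probs_a[label_a] + probs_b[label_b]) / 2
        if label_a == label_b and confidence >= threshold:
            selected[i] = label_a
    return selected


def make_toy_scorer(positive_cues: Iterable[str]) -> Scorer:
    """Toy stand-in for a PLM classifier: high 'positive' probability
    if any cue word occurs in the lowercased text."""
    cues = tuple(positive_cues)

    def score(doc: str) -> Probs:
        p = 0.9 if any(cue in doc.lower() for cue in cues) else 0.1
        return {"positive": p, "negative": 1.0 - p}

    return score


docs = ["Great movie, I love it", "I enjoy nothing here", "Terrible plot"]
scorer_a = make_toy_scorer(["great", "love"])
scorer_b = make_toy_scorer(["great", "enjoy"])

# The second document is dropped because the two scorers disagree on it.
pseudo_labels = select_pseudo_labels(docs, scorer_a, scorer_b)
print(pseudo_labels)
```

In a self-training loop, the retained pseudo labels would be used to update both models before the selection step is repeated. Documents on which the models disagree are left unlabeled for that round instead of being trained on as noise.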
