Satya Sai Srinath Namburi

Pearls from Pebbles: Improved Confidence Functions for Auto-labeling

Apr 24, 2024

Harit Vishwakarma, Reid Chen, Sui Jiet Tay, Satya Sai Srinath Namburi, Frederic Sala, Ramya Korlakai Vinayak

Abstract: Auto-labeling is an important family of techniques that produce labeled training sets with minimum manual labeling. A prominent variant, threshold-based auto-labeling (TBAL), works by finding a threshold on a model's confidence scores above which it can accurately label unlabeled data points. However, many models are known to produce overconfident scores, leading to poor TBAL performance. While a natural idea is to apply off-the-shelf calibration methods to alleviate the overconfidence issue, such methods still fall short. Rather than experimenting with ad-hoc choices of confidence functions, we propose a framework for studying the optimal TBAL confidence function. We develop a tractable version of the framework to obtain Colander (Confidence functions for Efficient and Reliable Auto-labeling), a new post-hoc method specifically designed to maximize performance in TBAL systems. We perform an extensive empirical evaluation of our method Colander and compare it against methods designed for calibration. Colander achieves up to 60% improvements on coverage over the baselines while maintaining auto-labeling error below 5% and using the same amount of labeled data as the baselines.
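The thresholding step that the abstract describes can be illustrated with a minimal sketch. This is not the paper's Colander method and not its exact TBAL procedure. It assumes a held-out validation set with known labels, uses the plain empirical error rather than any confidence bound, and all function names (`pick_threshold`, `auto_label`, `coverage`) are hypothetical.

```python
def pick_threshold(confidences, correct, max_error=0.05):
    """Return the lowest confidence threshold t such that validation points
    with confidence >= t have empirical error <= max_error, or None if no
    threshold qualifies. Illustrative only: uses raw empirical error."""
    pairs = sorted(zip(confidences, correct), key=lambda p: p[0], reverse=True)
    best = None
    errors = 0
    i = 0
    n = len(pairs)
    while i < n:
        t = pairs[i][0]
        # Consume every point tied at this confidence value before judging t.
        while i < n and pairs[i][0] == t:
            if not pairs[i][1]:
                errors += 1
            i += 1
        # i points have confidence >= t; keep t if their error is acceptable.
        if errors / i <= max_error:
            best = t
    return best


def auto_label(confidences, predictions, threshold):
    """Keep (index, predicted_label) for points at or above the threshold."""
    if threshold is None:
        return []
    return [
        (idx, pred)
        for idx, (conf, pred) in enumerate(zip(confidences, predictions))
        if conf >= threshold
    ]


def coverage(labeled, total):
    """Fraction of the unlabeled pool that received an automatic label."""
    return len(labeled) / total if total else 0.0


# Hypothetical validation scores and whether each prediction was correct.
val_conf = [0.99, 0.95, 0.90, 0.80, 0.60, 0.55]
val_correct = [True, True, True, True, False, False]
threshold = pick_threshold(val_conf, val_correct, max_error=0.05)

# Hypothetical unlabeled pool with model confidences and predictions.
pool_conf = [0.85, 0.70, 0.95]
pool_pred = ["cat", "dog", "cat"]
labeled = auto_label(pool_conf, pool_pred, threshold)
print(threshold, labeled, coverage(labeled, len(pool_conf)))
```

In this sketch an overconfident model inflates the scores of wrong predictions, which forces the threshold upward and lowers coverage. That is the failure mode the abstract attributes to poorly chosen confidence functions.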

The Cost of Compression: Investigating the Impact of Compression on Parametric Knowledge in Language Models

Dec 01, 2023

Satya Sai Srinath Namburi, Makesh Sreedhar, Srinath Srinivasan, Frederic Sala

Abstract: Compressing large language models (LLMs), often consisting of billions of parameters, provides faster inference and smaller memory footprints, and enables local deployment. Two standard compression techniques are pruning and quantization, with the former eliminating redundant connections in model layers and the latter representing model parameters with fewer bits. The key tradeoff is between the degree of compression and the impact on the quality of the compressed model. Existing research on LLM compression primarily focuses on performance in terms of general metrics like perplexity or downstream task accuracy. More fine-grained metrics, such as those measuring parametric knowledge, remain significantly underexplored. To help bridge this gap, we present a comprehensive analysis across multiple model families (ENCODER, ENCODER-DECODER, and DECODER) using the LAMA and LM-HARNESS benchmarks in order to systematically quantify the effect of commonly employed compression techniques on model performance. A particular focus is on tradeoffs involving parametric knowledge, with the goal of providing practitioners with practical insights to help make informed decisions on compression. We release our codebase to enable further research.

* Accepted to EMNLP 2023 Findings
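The two compression techniques named in the abstract above can be sketched on a flat list of weights. This is a toy illustration under stated assumptions, not the paper's codebase: it uses unstructured magnitude pruning and symmetric uniform quantization as representative variants, and the function names (`magnitude_prune`, `quantize`, `dequantize`) are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the fraction `sparsity` of weights with the smallest
    absolute value (unstructured magnitude pruning)."""
    k = int(len(weights) * sparsity)
    if k == 0:
        return list(weights)
    # Indices of the k smallest-magnitude weights.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])
    return [0.0 if i in drop else w for i, w in enumerate(weights)]


def quantize(weights, bits):
    """Symmetric uniform quantization to signed integers.
    Returns (integer codes, scale). Assumes bits >= 2."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point weights."""
    return [c * scale for c in codes]


# Hypothetical weights from a single layer.
weights = [0.5, -0.1, 0.02, -0.8, 0.3]
pruned = magnitude_prune(weights, sparsity=0.4)
codes, scale = quantize(pruned, bits=8)
restored = dequantize(codes, scale)
print(pruned)
print(codes)
print(max(abs(a - b) for a, b in zip(pruned, restored)))
```

The gap between `pruned` and `restored` is the quantization error. Lowering `bits` or raising `sparsity` compresses more and increases the distortion, which is the tradeoff the abstract sets out to measure.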
