Nico Manzonelli

LLM Confidence Evaluation Measures in Zero-Shot CSS Classification

Oct 16, 2024

David Farr, Iain Cruickshank, Nico Manzonelli, Nicholas Clark, Kate Starbird, Jevin West

Abstract: Assessing classification confidence is critical for leveraging large language models (LLMs) in automated labeling tasks, especially in the sensitive domains presented by Computational Social Science (CSS) tasks. In this paper, we make three key contributions: (1) we propose an uncertainty quantification (UQ) performance measure tailored for data annotation tasks, (2) we compare, for the first time, five different UQ strategies across three distinct LLMs and CSS data annotation tasks, and (3) we introduce a novel UQ aggregation strategy that effectively identifies low-confidence LLM annotations and disproportionately uncovers data incorrectly labeled by the LLMs. Our results demonstrate that our proposed UQ aggregation strategy improves upon existing methods and can be used to significantly improve human-in-the-loop data annotation processes.
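As a rough illustration of the human-in-the-loop idea in this abstract, the sketch below ranks LLM annotations by uncertainty and checks how many labeling errors fall inside the flagged set. It is not the paper's UQ measure or aggregation strategy: normalized entropy stands in for the UQ score, and the function names, the review `budget`, and the toy probabilities are all assumptions made for the example.

```python
import math

def normalized_entropy(probs):
    """Shannon entropy of a label distribution, scaled to [0, 1]."""
    if len(probs) < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(probs))

def flag_for_review(label_probs, budget):
    """Return the indices of the `budget` most uncertain annotations."""
    ranked = sorted(
        range(len(label_probs)),
        key=lambda i: normalized_entropy(label_probs[i]),
        reverse=True,
    )
    return sorted(ranked[:budget])

def error_capture_rate(flagged, predictions, gold):
    """Share of all LLM labeling errors that fall inside the flagged set."""
    errors = {i for i, (p, g) in enumerate(zip(predictions, gold)) if p != g}
    if not errors:
        return 0.0
    return len(errors & set(flagged)) / len(errors)

# Toy data (hypothetical): per-item label probabilities from an LLM,
# its predicted labels, and human gold labels.
label_probs = [[0.95, 0.05], [0.55, 0.45], [0.80, 0.20], [0.51, 0.49]]
predictions = [0, 0, 0, 0]
gold = [0, 1, 0, 1]

flagged = flag_for_review(label_probs, budget=2)
capture = error_capture_rate(flagged, predictions, gold)
```

With a review budget of half the data, a useful uncertainty score should capture well over half of the errors; comparing that capture rate across scores is one simple way to compare UQ strategies.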

LLM Chain Ensembles for Scalable and Accurate Data Annotation

Oct 16, 2024

David Farr, Nico Manzonelli, Iain Cruickshank, Kate Starbird, Jevin West

Abstract: The ability of large language models (LLMs) to perform zero-shot classification makes them viable solutions for data annotation in rapidly evolving domains where quality labeled data is often scarce and costly to obtain. However, the large-scale deployment of LLMs can be prohibitively expensive. This paper introduces an LLM chain ensemble methodology that aligns multiple LLMs in a sequence, routing data subsets to subsequent models based on classification uncertainty. This approach leverages the strengths of individual LLMs within a broader system, allowing each model to handle data points where it exhibits the highest confidence, while forwarding more complex cases to potentially more robust models. Our results show that the chain ensemble method often exceeds the performance of the best individual model in the chain and achieves substantial cost savings, making LLM chain ensembles a practical and efficient solution for large-scale data annotation challenges.
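The routing idea described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: each model is a hypothetical callable returning a label and a confidence score, routing uses a fixed per-stage confidence threshold (the paper may select subsets differently), and the two toy models are keyword stubs standing in for real LLM calls.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Assumed interface: a classifier maps a text to (label, confidence in [0, 1]).
Classifier = Callable[[str], Tuple[str, float]]

def chain_ensemble(
    texts: Sequence[str],
    chain: Sequence[Classifier],
    thresholds: Sequence[float],
) -> Tuple[List[Optional[str]], List[int]]:
    """Label texts with a sequence of models, forwarding uncertain items.

    `thresholds` holds one value per non-final stage. The last model in
    the chain labels everything that reaches it.
    """
    labels: List[Optional[str]] = [None] * len(texts)
    handled_by = [-1] * len(texts)  # index of the stage that labeled each item
    pending = list(range(len(texts)))
    for stage, model in enumerate(chain):
        is_last = stage == len(chain) - 1
        still_pending = []
        for i in pending:
            label, confidence = model(texts[i])
            if is_last or confidence >= thresholds[stage]:
                labels[i] = label
                handled_by[i] = stage
            else:
                still_pending.append(i)  # forward to the next model
        pending = still_pending
        if not pending:
            break
    return labels, handled_by

# Hypothetical stand-ins for a cheap model and a stronger, costlier model.
def cheap_model(text: str) -> Tuple[str, float]:
    return ("positive", 0.9) if "great" in text else ("negative", 0.55)

def strong_model(text: str) -> Tuple[str, float]:
    if "good" in text or "great" in text:
        return ("positive", 0.8)
    return ("negative", 0.85)

texts = ["great product", "pretty good overall", "broke on day one"]
labels, handled_by = chain_ensemble(texts, [cheap_model, strong_model], [0.7])
```

In this toy run the cheap model keeps only the item it is confident about, and the stronger model is called on the remaining two, which is where the cost saving comes from when most items are resolved early in the chain.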

RED-CT: A Systems Design Methodology for Using LLM-labeled Data to Train and Deploy Edge Classifiers for Computational Social Science

Aug 15, 2024

David Farr, Nico Manzonelli, Iain Cruickshank, Jevin West

Abstract: Large language models (LLMs) have enhanced our ability to rapidly analyze and classify unstructured natural language data. However, concerns regarding cost, network limitations, and security constraints have posed challenges for their integration into work processes. In this study, we adopt a systems design approach to employing LLMs as imperfect data annotators for downstream supervised learning tasks, introducing novel system intervention measures aimed at improving classification performance. Our methodology outperforms LLM-generated labels in seven of eight tests, demonstrating an effective strategy for incorporating LLMs into the design and deployment of specialized, supervised learning models present in many industry use cases.

Membership Inference Attacks and Privacy in Topic Modeling

Mar 07, 2024

Nico Manzonelli, Wanrong Zhang, Salil Vadhan

Abstract: Recent research shows that large language models are susceptible to privacy attacks that infer aspects of the training data. However, it is unclear if simpler generative models, like topic models, share similar vulnerabilities. In this work, we propose an attack against topic models that can confidently identify members of the training data in Latent Dirichlet Allocation. Our results suggest that the privacy risks associated with generative modeling are not restricted to large neural models. Additionally, to mitigate these vulnerabilities, we explore differentially private (DP) topic modeling. We propose a framework for private topic modeling that incorporates DP vocabulary selection as a pre-processing step, and show that it improves privacy while having limited effects on practical utility.

* 9 pages + appendices and references. 9 figures. Submitted to USENIX '24
