Michael Wenger

LLM-Assisted Content Analysis: Using Large Language Models to Support Deductive Coding

Jun 23, 2023

Robert Chew, John Bollenbacher, Michael Wenger, Jessica Speer, Annice Kim

Abstract: Deductive coding is a widely used qualitative research method for determining the prevalence of themes across documents. While useful, deductive coding is often burdensome and time-consuming since it requires researchers to read, interpret, and reliably categorize a large body of unstructured text documents. Large language models (LLMs), like ChatGPT, are a class of quickly evolving AI tools that can perform a range of natural language processing and reasoning tasks. In this study, we explore the use of LLMs to reduce the time it takes for deductive coding while retaining the flexibility of a traditional content analysis. We outline the proposed approach, called LLM-assisted content analysis (LACA), along with an in-depth case study using GPT-3.5 for LACA on a publicly available deductive coding data set. Additionally, we conduct an empirical benchmark using LACA on 4 publicly available data sets to assess the broader question of how well GPT-3.5 performs across a range of deductive coding tasks. Overall, we find that GPT-3.5 can often perform deductive coding at levels of agreement comparable to human coders. Additionally, we demonstrate that LACA can help refine prompts for deductive coding, identify codes for which an LLM is randomly guessing, and help assess when to use LLMs vs. human coders for deductive coding. We conclude with several implications for future practice of deductive coding and related research methods.
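To make "levels of agreement comparable to human coders" concrete, here is a minimal sketch of a chance-corrected agreement score between a human coder and an LLM on the same documents. The abstract does not say which agreement statistic the paper uses; Cohen's kappa is chosen here only as one common option, and the function name and toy labels are hypothetical. A score near 0 for a given code would be consistent with the model guessing at roughly chance level on that code.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two coders who labeled the same items.

    Assumption: kappa is used for illustration only; the abstract does
    not name the agreement statistic used in the paper.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both coders must label the same non-empty item set")

    n = len(labels_a)

    # Observed agreement: share of items given the same code by both coders.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement if each coder assigned codes independently,
    # following their own marginal code frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both coders used a single identical code everywhere; kappa is
        # undefined, so this sketch reports it as perfect agreement.
        return 1.0

    return (observed - expected) / (1.0 - expected)


# Hypothetical toy labels for one code ("theme present?") on four documents.
human_codes = ["yes", "yes", "no", "no"]
llm_codes = ["yes", "no", "no", "no"]
kappa = cohens_kappa(human_codes, llm_codes)
```

Computing such a score per code, rather than once over all codes, is what would let a researcher see which individual codes the model handles poorly.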

SMART: An Open Source Data Labeling Platform for Supervised Learning

Dec 11, 2018

Rob Chew, Michael Wenger, Caroline Kery, Jason Nance, Keith Richards, Emily Hadley, Peter Baumgartner

Abstract: SMART is an open source web application designed to help data scientists and research teams efficiently build labeled training data sets for supervised machine learning tasks. SMART provides users with an intuitive interface for creating labeled data sets, supports active learning to help reduce the required amount of labeled data, and incorporates inter-rater reliability statistics to provide insight into label quality. SMART is designed to be platform agnostic and easily deployable to meet the needs of as many different research teams as possible. The project website contains links to the code repository and extensive user documentation.

* 5 pages, 1 figure
