Yiheng Su

Using Large Language Models to Assist Video Content Analysis: An Exploratory Study of Short Videos on Depression

Jun 27, 2024

Jiaying Liu, Yunlong Wang, Yao Lyu, Yiheng Su, Shuo Niu, Xuhai "Orson" Xu, Yan Zhang

Abstract: Despite the growing interest in leveraging Large Language Models (LLMs) for content analysis, current studies have primarily focused on text-based content. In the present work, we explored the potential of LLMs in assisting video content analysis by conducting a case study that followed a new workflow of LLM-assisted multimodal content analysis. The workflow encompasses codebook design, prompt engineering, LLM processing, and human evaluation. We strategically crafted annotation prompts to obtain LLM Annotations in structured form, and explanation prompts to generate LLM Explanations that make the LLM's reasoning more transparent. To test the LLM's video annotation capabilities, we analyzed 203 keyframes extracted from 25 YouTube short videos about depression. We compared the LLM Annotations with those of two human coders and found that the LLM was more accurate on object and activity Annotations than on emotion and genre Annotations. Moreover, we identified the potential and limitations of the LLM's capabilities in annotating videos. Based on the findings, we explore opportunities and challenges for future research and improvements to the workflow. We also discuss ethical concerns surrounding future studies based on LLM-assisted video analysis.

* 6 pages, 2 figures, under review in CSCW 24
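
The comparison step described in the abstract (LLM annotations checked against human coding, broken down by annotation type) might be sketched as follows. This is only an illustration under stated assumptions: the function name `accuracy_by_category`, the record layout, and the toy labels are all hypothetical, and the sketch collapses the two human coders into a single reference label. None of the values come from the study.

```python
from collections import defaultdict


def accuracy_by_category(records):
    """Return {category: share of records where the LLM label equals the human label}."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for category, llm_label, human_label in records:
        totals[category] += 1
        hits[category] += int(llm_label == human_label)
    return {category: hits[category] / totals[category] for category in totals}


# Invented toy records (category, LLM label, human label); NOT data from the study.
records = [
    ("object", "phone", "phone"),
    ("object", "bed", "bed"),
    ("emotion", "sad", "neutral"),
    ("emotion", "sad", "sad"),
]

scores = accuracy_by_category(records)
for category, score in scores.items():
    print(f"{category}: {score:.2f}")
```

Reporting accuracy per category rather than one pooled number is what lets a difference between, say, object and emotion annotations show up at all.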

Interpretable by Design: Wrapper Boxes Combine Neural Performance with Faithful Explanations

Nov 15, 2023

Yiheng Su, Juni Jessy Li, Matthew Lease

Abstract: Can we preserve the accuracy of neural models while also providing faithful explanations? We present wrapper boxes, a general approach to generate faithful, example-based explanations for model predictions while maintaining predictive performance. After training a neural model as usual, its learned feature representation is input to a classic, interpretable model to perform the actual prediction. This simple strategy is surprisingly effective, with results largely comparable to those of the original neural model, as shown across three large pre-trained language models, two datasets of varying scale, four classic models, and four evaluation metrics. Moreover, because these classic models are interpretable by design, the subset of training examples that determine classic model predictions can be shown directly to users.
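
A minimal sketch of the idea in the abstract, assuming k-nearest neighbors as the classic model (the abstract does not name the four classic models it evaluates). The function `knn_wrapper_predict` and the two-dimensional toy vectors are hypothetical stand-ins; in practice the feature vectors would be representations extracted from the trained neural model.

```python
from collections import Counter
from math import dist


def knn_wrapper_predict(train_feats, train_labels, query_feat, k=3):
    """Predict by majority vote over the k nearest training representations.

    Returns the predicted label together with the indices of the training
    examples that determined it, which can be shown as the explanation.
    """
    # Rank training examples by Euclidean distance to the query representation.
    order = sorted(range(len(train_feats)),
                   key=lambda i: dist(train_feats[i], query_feat))
    neighbors = order[:k]
    votes = Counter(train_labels[i] for i in neighbors)
    label = votes.most_common(1)[0][0]
    return label, neighbors


# Toy stand-ins for feature vectors taken from a trained neural model (assumption).
train_feats = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (1.0, 0.9), (0.9, 1.1)]
train_labels = ["neg", "neg", "neg", "pos", "pos"]

label, neighbors = knn_wrapper_predict(train_feats, train_labels, (1.0, 1.0), k=3)
print(label, neighbors)
```

Because the prediction is computed only from the returned neighbors, showing those training examples to a user is a faithful account of why the label was chosen, which is the property the abstract emphasizes.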
