Rajesh Prabhakar

STONYBOOK: A System and Resource for Large-Scale Analysis of Novels

Nov 06, 2023

Charuta Pethe, Allen Kim, Rajesh Prabhakar, Tanzir Pial, Steven Skiena

Abstract: Books have historically been the primary mechanism through which narratives are transmitted. We have developed a collection of resources for the large-scale analysis of novels, including: (1) an open-source end-to-end NLP analysis pipeline for the annotation of novels into a standard XML format, (2) a collection of 49,207 distinct cleaned and annotated novels, and (3) a database with an associated web interface for the large-scale aggregate analysis of these literary works. We describe the major functionalities provided in the annotation system along with their utilities. We present samples of analysis artifacts from our website, such as visualizations of character occurrences and interactions, similar books, representative vocabulary, part-of-speech statistics, and readability metrics. We also describe the use of the annotated format in qualitative and quantitative analysis across large corpora of novels.

* 8 pages, 12 figures
