Abstract: This contribution describes a two-course module that seeks to provide humanities majors with a basic understanding of language technology and its applications using Python. The learning materials consist of interactive Jupyter Notebooks and accompanying YouTube videos, which are openly available with a Creative Commons licence.
Abstract: In this article, we bring together theories of multimodal communication and computational methods to study how primary school science diagrams combine multiple expressive resources. We position our work within the field of digital humanities, and show how annotations informed by multimodality research, which target expressive resources and discourse structure, allow us to impose structure on the output of computational methods. We illustrate our approach by analysing two multimodal diagram corpora: the first corpus is intended to support research on automatic diagram processing, whereas the second is oriented towards studying diagrams as a mode of communication. Our results show that multimodally-informed annotations can bring out structural patterns in the diagrams, patterns that also extend across diagrams dealing with different topics.
Abstract: In this article, we propose a multimodal perspective on diagrammatic representations by sketching a description of what may be tentatively termed the diagrammatic mode. We consider diagrammatic representations in the light of contemporary multimodality theory and explicate what enables them to integrate natural language, various forms of graphics, and diagrammatic elements such as arrows and lines, among other expressive resources, into coherent organisations. We illustrate the proposed approach using two recent diagram corpora and show how a multimodal approach supports the empirical analysis of diagrammatic representations, especially in identifying diagrammatic constituents and describing their interrelations.
Abstract: This article introduces AI2D-RST, a multimodal corpus of 1000 English-language diagrams that represent topics in primary school natural science, such as food webs, life cycles, moon phases and human physiology. The corpus is based on the Allen Institute for Artificial Intelligence Diagrams (AI2D) dataset, a collection of diagrams with crowd-sourced descriptions, which was originally developed for computational tasks such as automatic diagram understanding and visual question answering. Building on the segmentation of diagram layouts in AI2D, the AI2D-RST corpus presents a new multi-layer annotation schema that provides a rich description of their multimodal structure. Annotated by trained experts, the layers describe (1) the grouping of diagram elements into perceptual units, (2) the connections set up by diagrammatic elements such as arrows and lines, and (3) the discourse relations between diagram elements, which are described using Rhetorical Structure Theory (RST). Each annotation layer in AI2D-RST is represented using a graph. The corpus is freely available for research and teaching.
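Since each AI2D-RST annotation layer is represented as a graph, the idea can be illustrated with a minimal sketch of a labelled graph for a grouping layer. The class, node identifiers and labels below are hypothetical illustrations and are not drawn from the corpus itself.

```python
# Minimal sketch of one annotation layer modelled as a directed
# labelled graph. All identifiers and labels here are invented
# for illustration; they are not actual AI2D-RST content.

class LayerGraph:
    """A directed labelled graph for one annotation layer."""

    def __init__(self):
        self.nodes = {}   # node id -> attribute dict
        self.edges = []   # (source, target, relation) triples

    def add_node(self, node_id, **attrs):
        self.nodes[node_id] = attrs

    def add_edge(self, source, target, relation):
        self.edges.append((source, target, relation))

    def children(self, node_id):
        """Return targets of edges leaving the given node."""
        return [t for s, t, _ in self.edges if s == node_id]

# Hypothetical grouping layer: a text element and a pictorial
# element gathered into one perceptual unit.
grouping = LayerGraph()
grouping.add_node("T0", kind="text")
grouping.add_node("B0", kind="pictorial")
grouping.add_node("G0", kind="group")      # the perceptual unit
grouping.add_edge("G0", "T0", "part-of")
grouping.add_edge("G0", "B0", "part-of")

print(grouping.children("G0"))
```

A separate `LayerGraph` instance could hold, say, the connectivity layer, with arrow nodes and `directed-connection` edges, so that the three layers stay independent while sharing element identifiers.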
Abstract: This article compares two multimodal resources that consist of diagrams which describe topics in elementary school natural sciences. Both resources contain the same diagrams and represent their structure using graphs, but differ in their annotation schemas and in how the annotations were created: by crowd-sourced workers for one resource and by trained experts for the other. This article reports on two experiments that evaluate how effectively crowd-sourced and expert-annotated graphs can represent the multimodal structure of diagrams for representation learning using various graph neural networks. The results show that the identity of diagram elements can be learned from their layout features, while the expert annotations provide better representations of diagram types.
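The core operation shared by the graph neural networks mentioned above is message passing, in which each node updates its feature vector by aggregating its neighbours' vectors. The following toy sketch shows a single round of mean aggregation over made-up layout features; the graph, feature values and node names are invented for illustration and do not reflect the actual experimental setup.

```python
# Toy illustration of one message-passing step, the basic operation
# behind graph neural networks: each node replaces its feature vector
# with the mean of its own vector and its neighbours' vectors.

def message_passing_step(features, edges):
    """One round of mean aggregation over an undirected graph.

    features: dict mapping node id -> list of floats
    edges: list of (u, v) pairs
    """
    # Build a symmetric adjacency list from the edge pairs.
    neighbours = {n: [] for n in features}
    for u, v in edges:
        neighbours[u].append(v)
        neighbours[v].append(u)

    updated = {}
    for node, vec in features.items():
        # Stack the node's own vector with its neighbours' vectors
        # and average column-wise.
        stack = [vec] + [features[m] for m in neighbours[node]]
        updated[node] = [sum(col) / len(stack) for col in zip(*stack)]
    return updated

# Three hypothetical diagram elements with 2-d "layout features"
# (e.g. the x and y coordinates of an element's centre).
feats = {"a": [0.0, 0.0], "b": [1.0, 1.0], "c": [2.0, 0.0]}
edges = [("a", "b"), ("b", "c")]

print(message_passing_step(feats, edges))
```

After one step, each node's features blend in information from its neighbourhood; stacking several such steps (with learned weights and non-linearities) is what lets a graph neural network learn, for example, element identities from layout features.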