Swapnil Hingmire

A Transfer Learning Pipeline for Educational Resource Discovery with Application in Leading Paragraph Generation

Jan 07, 2022

Irene Li, Thomas George, Alexander Fabbri, Tammy Liao, Benjamin Chen, Rina Kawamura, Richard Zhou, Vanessa Yan, Swapnil Hingmire, Dragomir Radev

Abstract: Effective human learning depends on a wide selection of educational materials that align with the learner's current understanding of the topic. While the Internet has revolutionized human learning and education, a substantial resource accessibility barrier still exists. Namely, the excess of online information can make it challenging to navigate and discover high-quality learning materials. In this paper, we propose the educational resource discovery (ERD) pipeline that automates web resource discovery for novel domains. The pipeline consists of three main steps: data collection, feature extraction, and resource classification. We start with a known source domain and conduct resource discovery on two unseen target domains via transfer learning. We first collect frequent queries from a set of seed documents and search on the web to obtain candidate resources, such as lecture slides and introductory blog posts. Then we introduce a novel pretrained information retrieval deep neural network model, query-document masked language modeling (QD-MLM), to extract deep features of these candidate resources. We apply a tree-based classifier to decide whether the candidate is a positive learning resource. The pipeline achieves F1 scores of 0.94 and 0.82 when evaluated on two similar but novel target domains. Finally, we demonstrate how this pipeline can benefit an application: leading paragraph generation for surveys. This is the first study that considers various web resources for survey generation, to the best of our knowledge. We also release a corpus of 39,728 manually labeled web resources and 659 queries from NLP, Computer Vision (CV), and Statistics (STATS).

CLICKER: A Computational LInguistics Classification Scheme for Educational Resources

Dec 16, 2021

Swapnil Hingmire, Irene Li, Rena Kawamura, Benjamin Chen, Alexander Fabbri, Xiangru Tang, Yixin Liu, Thomas George, Tammy Liao, Wai Pan Wong (+4 more)

Abstract: A classification scheme of a scientific subject gives an overview of its body of knowledge. It can also be used to facilitate access to research articles and other materials related to the subject. For example, the ACM Computing Classification System (CCS) is used in the ACM Digital Library search interface and also for indexing computer science papers. We observed that a comprehensive classification system like CCS or Mathematics Subject Classification (MSC) does not exist for Computational Linguistics (CL) and Natural Language Processing (NLP). We propose a classification scheme, CLICKER, for CL/NLP based on the analysis of online lectures from 77 university courses on this subject. The currently proposed taxonomy includes 334 topics and focuses on educational aspects of CL/NLP; it is based primarily, but not exclusively, on lecture notes from NLP courses. We discuss how such a taxonomy can help in various real-world applications, including tutoring platforms, resource retrieval, resource recommendation, prerequisite chain learning, and survey generation.

* 7 pages, 5 figures, 4 tables

R-VGAE: Relational-variational Graph Autoencoder for Unsupervised Prerequisite Chain Learning

Apr 22, 2020

Irene Li, Alexander Fabbri, Swapnil Hingmire, Dragomir Radev

Abstract: The task of concept prerequisite chain learning is to automatically determine the existence of prerequisite relationships among concept pairs. In this paper, we frame learning prerequisite relationships among concepts as an unsupervised task with no access to labeled concept pairs during training. We propose a model called the Relational-Variational Graph AutoEncoder (R-VGAE) to predict concept relations within a graph consisting of concept and resource nodes. Results show that our unsupervised approach outperforms graph-based semi-supervised methods and other baseline methods by up to 9.77% and 10.47% in terms of prerequisite relation prediction accuracy and F1 score, respectively. Our method is notably the first graph-based model that attempts to make use of deep learning representations for the task of unsupervised prerequisite learning. We also expand an existing corpus, which now totals 1,717 English Natural Language Processing (NLP)-related lecture slide files and manual concept pair annotations over 322 topics.

* 2 Figures, 3 Tables, 9 Pages

Topics and Label Propagation: Best of Both Worlds for Weakly Supervised Text Classification

Dec 04, 2017

Sachin Pawar, Nitin Ramrakhiyani, Swapnil Hingmire, Girish K. Palshikar

Abstract: We propose a Label Propagation based algorithm for weakly supervised text classification. We construct a graph where each document is represented by a node and edge weights represent similarities among the documents. Additionally, we discover underlying topics using Latent Dirichlet Allocation (LDA) and enrich the document graph by including the topics in the form of additional nodes. The edge weights between a topic and a text document represent the level of "affinity" between them. Our approach does not require document-level labelling; instead, it expects manual labels only for topic nodes. This significantly reduces the level of supervision needed, as only a few topics are observed to be enough for achieving sufficiently high accuracy. The Label Propagation Algorithm is employed on this enriched graph to propagate labels among the nodes. Our approach combines the advantages of Label Propagation (through document-document similarities) and Topic Modelling (for minimal but smart supervision). We demonstrate the effectiveness of our approach on various datasets and compare with state-of-the-art weakly supervised text classification approaches.
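
The idea in this abstract (labels supplied only for topic nodes, then spread to document nodes over a weighted graph) can be illustrated with a small sketch. This is a generic iterative label propagation with clamped seed nodes, not necessarily the authors' exact algorithm. The function name `propagate_labels`, the six-node graph, and every edge weight below are hypothetical values chosen for illustration; the LDA step that would produce the topic nodes and affinities is omitted.

```python
def propagate_labels(weights, seed_labels, num_classes, iterations=50):
    """Generic label propagation sketch (an assumption, not the paper's code).

    weights: symmetric n x n matrix (list of lists) of edge weights.
    seed_labels: dict mapping node index -> class index (here: topic nodes only).
    Returns the predicted class index for every node.
    """
    n = len(weights)
    # Unlabeled nodes start with a uniform class distribution; seeds are one-hot.
    scores = [[1.0 / num_classes] * num_classes for _ in range(n)]
    for node, cls in seed_labels.items():
        scores[node] = [1.0 if c == cls else 0.0 for c in range(num_classes)]

    for _ in range(iterations):
        updated = []
        for i in range(n):
            total = sum(weights[i])
            if i in seed_labels or total == 0:
                # Seed nodes are clamped; isolated nodes keep their scores.
                updated.append(scores[i])
                continue
            # Each node takes the weighted average of its neighbours' scores.
            updated.append([
                sum(weights[i][j] * scores[j][c] for j in range(n)) / total
                for c in range(num_classes)
            ])
        scores = updated

    return [max(range(num_classes), key=lambda c: row[c]) for row in scores]


# Hypothetical graph: nodes 0-3 are documents, nodes 4-5 are topics.
edges = [
    (0, 1, 0.8), (2, 3, 0.6),               # document-document similarity
    (0, 4, 0.9), (1, 4, 0.7), (2, 4, 0.1), (3, 4, 0.2),  # affinity to topic 4
    (0, 5, 0.1), (1, 5, 0.2), (2, 5, 0.9), (3, 5, 0.8),  # affinity to topic 5
]
weights = [[0.0] * 6 for _ in range(6)]
for a, b, w in edges:
    weights[a][b] = w
    weights[b][a] = w

# Only the two topic nodes carry manual labels.
predicted = propagate_labels(weights, {4: 0, 5: 1}, num_classes=2)
print(predicted)  # [0, 0, 1, 1, 0, 1]
```

In this toy graph, documents 0 and 1 inherit the label of topic 4 and documents 2 and 3 inherit the label of topic 5, even though no document was labeled directly.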
