Abstract:Natural Language Processing (NLP) aims to analyze the text via techniques in the computer science field. It serves the applications in healthcare, commerce, and education domains. Particularly, NLP has been applied to the education domain to help teaching and learning. In this survey, we review recent advances in NLP with a focus on solving problems related to the education domain. In detail, we begin with introducing the relevant background. Then, we present the taxonomy of NLP in the education domain. Next, we illustrate the task definition, challenges, and corresponding techniques based on the above taxonomy. After that, we showcase some off-the-shelf demonstrations in this domain and conclude with future directions.
Abstract:Code pre-trained models (CodePTMs) have recently demonstrated significant success in code intelligence. To interpret these models, some probing methods have been applied. However, these methods fail to consider the inherent characteristics of codes. In this paper, to address the problem, we propose a novel probing method CAT-probing to quantitatively interpret how CodePTMs attend code structure. We first denoise the input code sequences based on the token types pre-defined by the compilers to filter those tokens whose attention scores are too small. After that, we define a new metric CAT-score to measure the commonality between the token-level attention scores generated in CodePTMs and the pair-wise distances between corresponding AST nodes. The higher the CAT-score, the stronger the ability of CodePTMs to capture code structure. We conduct extensive experiments to integrate CAT-probing with representative CodePTMs for different programming languages. Experimental results show the effectiveness of CAT-probing in CodePTM interpretation. Our codes and data are publicly available at https://github.com/nchen909/CodeAttention.
Abstract:Code summarization with deep learning has been widely studied in recent years. Current deep learning models for code summarization generally follow the principle in neural machine translation and adopt the encoder-decoder framework, where the encoder learns the semantic representations from source code and the decoder transforms the learnt representations into human-readable text that describes the functionality of code snippets. Despite they achieve the new state-of-the-art performance, we notice that current models often either generate less fluent summaries, or fail to capture the core functionality, since they usually focus on a single type of code representations. As such we propose GypSum, a new deep learning model that learns hybrid representations using graph attention neural networks and a pre-trained programming and natural language model. We introduce particular edges related to the control flow of a code snippet into the abstract syntax tree for graph construction, and design two encoders to learn from the graph and the token sequence of source code, respectively. We modify the encoder-decoder sublayer in the Transformer's decoder to fuse the representations and propose a dual-copy mechanism to facilitate summary generation. Experimental results demonstrate the superior performance of GypSum over existing code summarization models.
Abstract:In this paper, we study knowledge tracing in the domain of programming education and make two important contributions. First, we harvest and publish so far the most comprehensive dataset, namely BePKT, which covers various online behaviors in an OJ system, including programming text problems, knowledge annotations, user-submitted code and system-logged events. Second, we propose a new model PDKT to exploit the enriched context for accurate student behavior prediction. More specifically, we construct a bipartite graph for programming problem embedding, and design an improved pre-training model PLCodeBERT for code embedding, as well as a double-sequence RNN model with exponential decay attention for effective feature fusion. Experimental results on the new dataset BePKT show that our proposed model establishes state-of-the-art performance in programming knowledge tracing. In addition, we verify that our code embedding strategy based on PLCodeBERT is complementary to existing knowledge tracing models to further enhance their accuracy. As a side product, PLCodeBERT also results in better performance in other programming-related tasks such as code clone detection.
Abstract:Knowledge tracing (KT) has recently been an active research area of computational pedagogy. The task is to model students mastery level of knowledge based on their responses to the questions in the past, as well as predict the probabilities that they correctly answer subsequent questions in the future. A good KT model can not only make students timely aware of their knowledge states, but also help teachers develop better personalized teaching plans for students. KT tasks were historically solved using statistical modeling methods such as Bayesian inference and factor analysis, but recent advances in deep learning have led to the successive proposals that leverage deep neural networks, including long short-term memory networks, memory-augmented networks and self-attention networks. While those deep models demonstrate superior performance over the traditional approaches, they all neglect more or less the impact on knowledge states of the most recent questions answered by students. The forgetting curve theory states that human memory retention declines over time, therefore knowledge states should be mostly affected by the recent questions. Based on this observation, we propose a Convolutional Knowledge Tracing (CKT) model in this paper. In addition to modeling the long-term effect of the entire question-answer sequence, CKT also strengthens the short-term effect of recent questions using 3D convolutions, thereby effectively modeling the forgetting curve in the learning process. Extensive experiments show that CKT achieves the new state-of-the-art in predicting students performance compared with existing models. Using CKT, we gain 1.55 and 2.03 improvements in terms of AUC over DKT and DKVMN respectively, on the ASSISTments2009 dataset. And on the ASSISTments2015 dataset, the corresponding improvements are 1.01 and 1.96 respectively.
Abstract:With the wide adoption of mobile devices, today's location tracking systems such as satellites, cellular base stations and wireless access points are continuously producing tremendous amounts of location data of moving objects. The ability to discover moving objects that travel together, i.e., traveling companions, from their trajectories is desired by many applications such as intelligent transportation systems and location-based services. Existing algorithms are either based on pattern mining methods that define a particular pattern of traveling companions or based on representation learning methods that learn similar representations for similar trajectories. The former methods suffer from the pairwise point-matching problem and the latter often ignore the temporal proximity between trajectories. In this work, we propose a generic deep representation learning model using autoencoders, namely, ATTN-MEAN, for the discovery of traveling companions. ATTN-MEAN collectively injects spatial and temporal information into its input embeddings using skip-gram, positional encoding techniques, respectively. Besides, our model further encourages trajectories to learn from their neighbours by leveraging the Sort-Tile-Recursive algorithm, mean operation and global attention mechanism. After obtaining the representations from the encoders, we run DBSCAN to cluster the representations to find travelling companion. The corresponding trajectories in the same cluster are considered as traveling companions. Experimental results suggest that ATTN-MEAN performs better than the state-of-the-art algorithms on finding traveling companions.