Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Meng Chang Chen

Institute of Information Science, Academia Sinica, Taipei, Taiwan

Headline Diagnosis: Manipulation of Content Farm Headlines

Apr 25, 2022

Yu-Chieh Chen, Pei-Yu Huang, Chun Lin, Yi-Ting Huang, Meng Chang Chen

Figure 1 for Headline Diagnosis: Manipulation of Content Farm Headlines

Figure 2 for Headline Diagnosis: Manipulation of Content Farm Headlines

Figure 3 for Headline Diagnosis: Manipulation of Content Farm Headlines

Figure 4 for Headline Diagnosis: Manipulation of Content Farm Headlines

Abstract:As technology grows faster, the news spreads through social media. In order to attract more readers and acquire additional profit, some news agencies reproduce massive news in a more appealing manner. Therefore, it is essential to accurately predict whether a news article is from official news agencies. This work develops a headline classification based on Convoluted Neural Network to determine credibility of a news article. The model primarily focuses on investigating key factors from headlines. These factors include word segmentation, part-of-speech tags, and sentiment features. With integrating these features into the proposed classification model, the demonstrated evaluation achieves 93.99% for accuracy.

* Taiwan Academic Network Conference 26(2020) 680-684
* Accepted by The 26th Taiwan Academic Network Conference (TANET) 2020

Via

Access Paper or Ask Questions

Integration of Static and Dynamic Analysis for Malware Family Classification with Composite Neural Network

Dec 24, 2019

Yao Saint Yen, Zhe Wei Chen, Ying Ren Guo, Meng Chang Chen

Figure 1 for Integration of Static and Dynamic Analysis for Malware Family Classification with Composite Neural Network

Figure 2 for Integration of Static and Dynamic Analysis for Malware Family Classification with Composite Neural Network

Figure 3 for Integration of Static and Dynamic Analysis for Malware Family Classification with Composite Neural Network

Figure 4 for Integration of Static and Dynamic Analysis for Malware Family Classification with Composite Neural Network

Abstract:Deep learning has been used in the research of malware analysis. Most classification methods use either static analysis features or dynamic analysis features for malware family classification, and rarely combine them as classification features and also no extra effort is spent integrating the two types of features. In this paper, we combine static and dynamic analysis features with deep neural networks for Windows malware classification. We develop several methods to generate static and dynamic analysis features to classify malware in different ways. Given these features, we conduct experiments with composite neural network, showing that the proposed approach performs best with an accuracy of 83.17% on a total of 80 malware families with 4519 malware samples. Additionally, we show that using integrated features for malware family classification outperforms using static features or dynamic features alone. We show how static and dynamic features complement each other for malware classification.

Via

Access Paper or Ask Questions

Learning Malware Representation based on Execution Sequences

Dec 16, 2019

Yi-Ting Huang, Ting-Yi Chen, Yeali S. Sun, Meng Chang Chen

Figure 1 for Learning Malware Representation based on Execution Sequences

Figure 2 for Learning Malware Representation based on Execution Sequences

Figure 3 for Learning Malware Representation based on Execution Sequences

Figure 4 for Learning Malware Representation based on Execution Sequences

Abstract:Malware analysis has been extensively investigated as the number and types of malware has increased dramatically. However, most previous studies use end-to-end systems to detect whether a sample is malicious, or to identify its malware family. In this paper, we propose a neural network framework composed of an embedder, an encoder, and a filter to learn malware representations from characteristic execution sequences for malware family classification. The embedder uses BERT and Sent2Vec, state-of-the-art embedding modules, to capture relations within a single API call and among consecutive API calls in an execution trace. The encoder comprises gated recurrent units (GRU) to preserve the ordinal position of API calls and a self-attention mechanism for comparing intra-relations among different positions of API calls. The filter identifies representative API calls to build the malware representation. We conduct broad experiments to determine the influence of individual framework components. The results show that the proposed framework outperforms the baselines, and also demonstrates that considering Sent2Vec to learn complete API call embeddings and GRU to explicitly preserve ordinal information yields more information and thus significant improvements. Also, the proposed approach effectively classifies new malicious execution traces on the basis of similarities with previously collected families.

Via

Access Paper or Ask Questions

Composite Neural Network: Theory and Application to PM2.5 Prediction

Oct 22, 2019

Ming-Chuan Yang, Meng Chang Chen

Figure 1 for Composite Neural Network: Theory and Application to PM2.5 Prediction

Figure 2 for Composite Neural Network: Theory and Application to PM2.5 Prediction

Figure 3 for Composite Neural Network: Theory and Application to PM2.5 Prediction

Figure 4 for Composite Neural Network: Theory and Application to PM2.5 Prediction

Abstract:This work investigates the framework and performance issues of the composite neural network, which is composed of a collection of pre-trained and non-instantiated neural network models connected as a rooted directed acyclic graph for solving complicated applications. A pre-trained neural network model is generally well trained, targeted to approximate a specific function. Despite a general belief that a composite neural network may perform better than a single component, the overall performance characteristics are not clear. In this work, we construct the framework of a composite network, and prove that a composite neural network performs better than any of its pre-trained components with a high probability bound. In addition, if an extra pre-trained component is added to a composite network, with high probability, the overall performance will not be degraded. In the study, we explore a complicated application---PM2.5 prediction---to illustrate the correctness of the proposed composite network theory. In the empirical evaluations of PM2.5 prediction, the constructed composite neural network models support the proposed theory and perform better than other machine learning models, demonstrate the advantages of the proposed framework.

Via

Access Paper or Ask Questions

Theoretical Investigation of Composite Neural Network

Oct 18, 2019

Ming-Chuan Yang, Meng Chang Chen

Figure 1 for Theoretical Investigation of Composite Neural Network

Figure 2 for Theoretical Investigation of Composite Neural Network

Figure 3 for Theoretical Investigation of Composite Neural Network

Figure 4 for Theoretical Investigation of Composite Neural Network

Abstract:A composite neural network is a rooted directed acyclic graph combining a set of pre-trained and non-instantiated neural network models. A pre-trained neural network model is well-crafted for a specific task and with instantiated weights. is generally well trained, targeted to approximate a specific function. Despite a general belief that a composite neural network may perform better than a single component, the overall performance characteristics are not clear. In this work, we prove that there exist parameters such that a composite neural network performs better than any of its pre-trained components with a high probability bound.

Via

Access Paper or Ask Questions

Bringing personalized learning into computer-aided question generation

Aug 29, 2018

Yi-Ting Huang, Meng Chang Chen, Yeali S. Sun

Figure 1 for Bringing personalized learning into computer-aided question generation

Figure 2 for Bringing personalized learning into computer-aided question generation

Figure 3 for Bringing personalized learning into computer-aided question generation

Figure 4 for Bringing personalized learning into computer-aided question generation

Abstract:This paper proposes a novel and statistical method of ability estimation based on acquisition distribution for a personalized computer aided question generation. This method captures the learning outcomes over time and provides a flexible measurement based on the acquisition distributions instead of precalibration. Compared to the previous studies, the proposed method is robust, especially when an ability of a student is unknown. The results from the empirical data show that the estimated abilities match the actual abilities of learners, and the pretest and post-test of the experimental group show significant improvement. These results suggest that this method can serves as the ability estimation for a personalized computer-aided testing environment.

Via

Access Paper or Ask Questions

Development and Evaluation of a Personalized Computer-aided Question Generation for English Learners to Improve Proficiency and Correct Mistakes

Aug 29, 2018

Yi-Ting Huang, Meng Chang Chen, Yeali S. Sun

Figure 1 for Development and Evaluation of a Personalized Computer-aided Question Generation for English Learners to Improve Proficiency and Correct Mistakes

Figure 2 for Development and Evaluation of a Personalized Computer-aided Question Generation for English Learners to Improve Proficiency and Correct Mistakes

Figure 3 for Development and Evaluation of a Personalized Computer-aided Question Generation for English Learners to Improve Proficiency and Correct Mistakes

Figure 4 for Development and Evaluation of a Personalized Computer-aided Question Generation for English Learners to Improve Proficiency and Correct Mistakes

Abstract:In the last several years, the field of computer assisted language learning has increasingly focused on computer aided question generation. However, this approach often provides test takers with an exhaustive amount of questions that are not designed for any specific testing purpose. In this work, we present a personalized computer aided question generation that generates multiple choice questions at various difficulty levels and types, including vocabulary, grammar and reading comprehension. In order to improve the weaknesses of test takers, it selects questions depending on an estimated proficiency level and unclear concepts behind incorrect responses. This results show that the students with the personalized automatic quiz generation corrected their mistakes more frequently than ones only with computer aided question generation. Moreover, students demonstrated the most progress between the pretest and post test and correctly answered more difficult questions. Finally, we investigated the personalizing strategy and found that a student could make a significant progress if the proposed system offered the vocabulary questions at the same level of his or her proficiency level, and if the grammar and reading comprehension questions were at a level lower than his or her proficiency level.

Via

Access Paper or Ask Questions

Characterizing the Influence of Features on Reading Difficulty Estimation for Non-native Readers

Aug 29, 2018

Yi-Ting Huang, Meng Chang Chen, Yeali S. Sun

Figure 1 for Characterizing the Influence of Features on Reading Difficulty Estimation for Non-native Readers

Figure 2 for Characterizing the Influence of Features on Reading Difficulty Estimation for Non-native Readers

Figure 3 for Characterizing the Influence of Features on Reading Difficulty Estimation for Non-native Readers

Figure 4 for Characterizing the Influence of Features on Reading Difficulty Estimation for Non-native Readers

Abstract:In recent years, the number of people studying English as a second language (ESL) has surpassed the number of native speakers. Recent work have demonstrated the success of providing personalized content based on reading difficulty, such as information retrieval and summarization. However, almost all prior studies of reading difficulty are designed for native speakers, rather than non-native readers. In this study, we investigate various features for ESL readers, by conducting a linear regression to estimate the reading level of English language sources. This estimation is based not only on the complexity of lexical and syntactic features, but also several novel concepts, including the age of word and grammar acquisition from several sources, word sense from WordNet, and the implicit relation between sentences. By employing Bayesian Information Criterion (BIC) to select the optimal model, we find that the combination of the number of words, the age of word acquisition and the height of the parsing tree generate better results than alternative competing models. Thus, our results show that proposed second language reading difficulty estimation outperforms other first language reading difficulty estimations.

Via

Access Paper or Ask Questions

Wrapped Loss Function for Regularizing Nonconforming Residual Distributions

Aug 21, 2018

Chun Ting Liu, Ming Chuan Yang, Meng Chang Chen

Figure 1 for Wrapped Loss Function for Regularizing Nonconforming Residual Distributions

Figure 2 for Wrapped Loss Function for Regularizing Nonconforming Residual Distributions

Figure 3 for Wrapped Loss Function for Regularizing Nonconforming Residual Distributions

Figure 4 for Wrapped Loss Function for Regularizing Nonconforming Residual Distributions

Abstract:Multi-output is essential in machine learning that it might suffer from nonconforming residual distributions, i.e., the multi-output residual distributions are not conforming to the expected distribution. In this paper we propose "Wrapped Loss Function" to wrap the original loss function to alleviate the problem. This wrapped loss function acts just like original loss function that its gradient can be used for backpropagation optimization. Empirical evaluations show wrapped loss function has advanced properties of faster convergence, better accuracy and improving imbalanced data.

* 10 pages

Via

Access Paper or Ask Questions