Abstract: Machine learning models are often criticized for their black-box nature, raising concerns about their applicability in critical decision-making scenarios. Consequently, there is a growing demand for interpretable models in such contexts. In this study, we introduce Model-based Deep Rule Forests (mobDRF), an interpretable representation learning algorithm designed to extract transparent models from data. By leveraging IF-THEN rules with multi-level logic expressions, mobDRF enhances the interpretability of existing models without compromising accuracy. We apply mobDRF to identify key risk factors for cognitive decline in an elderly population, demonstrating its effectiveness in subgroup analysis and local model optimization. Our method offers a promising solution for developing trustworthy and interpretable machine learning models, particularly valuable in fields like healthcare, where understanding differential effects across patient subgroups can lead to more personalized and effective treatments.
Abstract: Researchers have been overwhelmed by the explosion of research articles published by various research communities. Many scholarly websites, search engines, and digital libraries have been created to help researchers identify potential research topics and keep up with recent progress in their areas of interest. However, it is still difficult for researchers to keep track of how research topics diffuse and evolve without spending a large amount of time reviewing numerous relevant and irrelevant articles. In this paper, we consider a novel topic diffusion discovery technique. Specifically, we propose using a Deep Non-negative Autoencoder with an information divergence measure that monitors the evolutionary distance of topic diffusion to understand how research topics change over time. The experimental results show that the proposed approach is able to identify the evolution of research topics and to discover topic diffusions in an online fashion.
Abstract: Many machine learning algorithms, such as deep neural networks, have long been criticized for being "black boxes": models that cannot explain how they arrive at a decision without further interpretive effort. This problem has raised concerns about the trust, safety, nondiscrimination, and other ethical aspects of model applications. In this paper, we discuss machine learning interpretability in a real-world application, eXtreme Multi-label Learning (XML), which involves learning models from annotated data with many pre-defined labels. We propose a two-step XML approach that combines a deep non-negative autoencoder with other multi-label classifiers to tackle different data applications with a large number of labels. Our experimental results show that the proposed approach is able to cope with many-label problems and to provide interpretable label hierarchies and dependencies that help us understand how the model recognizes the presence of objects in an image.
Abstract: Due to the recent explosion of text data, researchers have been overwhelmed by the ever-increasing volume of articles produced by different research communities. Various scholarly search websites, citation recommendation engines, and research databases have been created to simplify text search tasks. However, it is still difficult for researchers to identify potential research topics without intensively reviewing a tremendous number of articles published by journals, conferences, meetings, and workshops. In this paper, we consider a novel topic diffusion discovery technique that incorporates sparseness-constrained Non-negative Matrix Factorization with generalized Jensen-Shannon divergence to help understand term-topic evolutions and identify topic diffusions. Our experimental results show that this approach can extract more prominent topics from large article databases, visualize relationships between terms of interest and abstract topics, and further help researchers understand whether given terms or topics have been widely explored or whether new topics are emerging from the literature.
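The generalized Jensen-Shannon divergence mentioned in the abstract above has a standard definition: the entropy of a weighted mixture of distributions minus the weighted sum of their individual entropies. The sketch below computes it with NumPy. It is only an illustration under stated assumptions, not the paper's implementation: the abstract does not say how topics are represented or weighted, so the function names, the uniform default weights, and the toy term distributions per time slice are all hypothetical.

```python
import numpy as np


def shannon_entropy(p):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    p = np.asarray(p, dtype=float)
    nz = p[p > 0]
    return float(-np.sum(nz * np.log2(nz)))


def generalized_js_divergence(distributions, weights=None):
    """Generalized Jensen-Shannon divergence of several distributions.

    JS_w(P_1, ..., P_n) = H(sum_i w_i P_i) - sum_i w_i H(P_i)

    `distributions` is an (n, k) array-like of non-negative rows; each row is
    normalized to sum to 1. `weights` defaults to uniform (an assumption).
    """
    dists = np.asarray(distributions, dtype=float)
    dists = dists / dists.sum(axis=1, keepdims=True)
    n = dists.shape[0]
    if weights is None:
        w = np.full(n, 1.0 / n)
    else:
        w = np.asarray(weights, dtype=float)
        w = w / w.sum()
    mixture = w @ dists  # weighted average distribution
    weighted_entropies = sum(wi * shannon_entropy(d) for wi, d in zip(w, dists))
    return shannon_entropy(mixture) - float(weighted_entropies)


if __name__ == "__main__":
    # Hypothetical term weights for one topic in three consecutive time slices.
    topic_over_time = [
        [0.70, 0.20, 0.10, 0.00],
        [0.50, 0.30, 0.15, 0.05],
        [0.10, 0.20, 0.30, 0.40],
    ]
    print(generalized_js_divergence(topic_over_time))
```

With uniform weights the value is 0 when all distributions are identical and grows as they move apart, reaching log2(n) bits for n distributions with disjoint support. Under the assumptions above, a larger value across time slices would suggest that the topic's term distribution has shifted.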