Anton Thielmann

Beyond Black-Box Predictions: Identifying Marginal Feature Effects in Tabular Transformer Networks

Apr 11, 2025

Anton Thielmann, Arik Reuter, Benjamin Saefken

Abstract: In recent years, deep neural networks have showcased their predictive power across a variety of tasks. Beyond natural language processing, the transformer architecture has proven efficient in addressing tabular data problems and challenges the previously dominant gradient-based decision trees in these areas. However, this predictive power comes at the cost of intelligibility: Marginal feature effects are almost completely lost in the black-box nature of deep tabular transformer networks. Alternative architectures that use the additivity constraints of classical statistical regression models can maintain intelligible marginal feature effects, but often fall short in predictive power compared to their more complex counterparts. To bridge the gap between intelligibility and performance, we propose an adaptation of tabular transformer networks designed to identify marginal feature effects. We provide theoretical justifications that marginal feature effects can be accurately identified, and our ablation study demonstrates that the proposed model efficiently detects these effects, even amidst complex feature interactions. To demonstrate the model's predictive capabilities, we compare it to several interpretable as well as black-box models and find that it can match black-box performances while maintaining intelligibility. The source code is available at https://github.com/OpenTabular/NAMpy.

GPTopic: Dynamic and Interactive Topic Representations

Mar 06, 2024

Arik Reuter, Anton Thielmann, Christoph Weisser, Sebastian Fischer, Benjamin Säfken

Abstract: Topic modeling seems to be almost synonymous with generating lists of top words to represent topics within large text corpora. However, deducing a topic from such a list of individual terms can require substantial expertise and experience, making topic modelling less accessible to people unfamiliar with the particularities and pitfalls of top-word interpretation. A topic representation limited to top-words might further fall short of offering a comprehensive and easily accessible characterization of the various aspects, facets and nuances a topic might have. To address these challenges, we introduce GPTopic, a software package that leverages Large Language Models (LLMs) to create dynamic, interactive topic representations. GPTopic provides an intuitive chat interface for users to explore, analyze, and refine topics interactively, making topic modeling more accessible and comprehensive. The corresponding code is available here: https://github.com/05ec6602be/GPTopic.

Probabilistic Topic Modelling with Transformer Representations

Mar 06, 2024

Arik Reuter, Anton Thielmann, Christoph Weisser, Benjamin Säfken, Thomas Kneib

Abstract: Topic modelling was mostly dominated by Bayesian graphical models during the last decade. With the rise of transformers in Natural Language Processing, however, several successful models that rely on straightforward clustering approaches in transformer-based embedding spaces have emerged and consolidated the notion of topics as clusters of embedding vectors. We propose the Transformer-Representation Neural Topic Model (TNTM), which combines the benefits of topic representations in transformer-based embedding spaces and probabilistic modelling. Therefore, this approach unifies the powerful and versatile notion of topics based on transformer embeddings with fully probabilistic modelling, as in models such as Latent Dirichlet Allocation (LDA). We utilize the variational autoencoder (VAE) framework for improved inference speed and modelling flexibility. Experimental results show that our proposed model achieves results on par with various state-of-the-art approaches in terms of embedding coherence while maintaining almost perfect topic diversity. The corresponding source code is available at https://github.com/ArikReuter/TNTM.

Topics in the Haystack: Extracting and Evaluating Topics beyond Coherence

Mar 30, 2023

Anton Thielmann, Quentin Seifert, Arik Reuter, Elisabeth Bergherr, Benjamin Säfken

Abstract: Extracting and identifying latent topics in large text corpora has gained increasing importance in Natural Language Processing (NLP). Most models, whether probabilistic models similar to Latent Dirichlet Allocation (LDA) or neural topic models, follow the same underlying approach of topic interpretability and topic extraction. We propose a method that incorporates a deeper understanding of both sentence and document themes, and goes beyond simply analyzing word frequencies in the data. This allows our model to detect latent topics that may include uncommon words or neologisms, as well as words not present in the documents themselves. Additionally, we propose several new evaluation metrics based on intruder words and similarity measures in the semantic space. We present correlation coefficients with human identification of intruder words and achieve near-human level results at the word-intrusion task. We demonstrate the competitive performance of our method with a large benchmark study, and achieve superior results compared to state-of-the-art topic modeling and document clustering models.

Structural Neural Additive Models: Enhanced Interpretable Machine Learning

Feb 18, 2023

Mattias Luber, Anton Thielmann, Benjamin Säfken

Abstract: Deep neural networks (DNNs) have shown exceptional performances in a wide range of tasks and have become the go-to method for problems requiring high-level predictive power. There has been extensive research on how DNNs arrive at their decisions; however, the inherently uninterpretable networks remain to this day mostly unobservable "black boxes". In recent years, the field has seen a push towards interpretable neural networks, such as the visually interpretable Neural Additive Models (NAMs). We take a further step in the direction of intelligibility beyond the mere visualization of feature effects and propose Structural Neural Additive Models (SNAMs), a modeling framework that combines classical and clearly interpretable statistical methods with the predictive power of neural applications. Our experiments validate the predictive performances of SNAMs. The proposed framework performs comparably to state-of-the-art fully connected DNNs, and we show that SNAMs can even outperform NAMs while remaining inherently more interpretable.

Neural Additive Models for Location Scale and Shape: A Framework for Interpretable Neural Regression Beyond the Mean

Jan 27, 2023

Anton Thielmann, René-Marcel Kruse, Thomas Kneib, Benjamin Säfken

Abstract: Deep neural networks (DNNs) have proven to be highly effective in a variety of tasks, making them the go-to method for problems requiring high-level predictive power. Despite this success, the inner workings of DNNs are often not transparent, making them difficult to interpret or understand. This lack of interpretability has led to increased research on inherently interpretable neural networks in recent years. Models such as Neural Additive Models (NAMs) achieve visual interpretability through the combination of classical statistical methods with DNNs. However, these approaches only concentrate on mean response predictions, leaving out other properties of the response distribution of the underlying data. We propose Neural Additive Models for Location Scale and Shape (NAMLSS), a modelling framework that combines the predictive power of classical deep learning models with the inherent advantages of distributional regression while maintaining the interpretability of additive models.

Human in the loop: How to effectively create coherent topics by manually labeling only a few documents per class

Dec 19, 2022

Anton Thielmann, Christoph Weisser, Benjamin Säfken

Abstract: Few-shot methods for accurate modeling under sparse-label settings have improved significantly. However, the applications of few-shot modeling in natural language processing remain solely in the field of document classification. With recent performance improvements, supervised few-shot methods, combined with a simple topic extraction method, pose a significant challenge to unsupervised topic modeling methods. Our research shows that supervised few-shot learning, combined with a simple topic extraction method, can outperform unsupervised topic modeling techniques in terms of generating coherent topics, even when only a few labeled documents per class are used.

Community-Detection via Hashtag-Graphs for Semi-Supervised NMF Topic Models

Nov 17, 2021

Mattias Luber, Anton Thielmann, Christoph Weisser, Benjamin Säfken

Abstract: Extracting topics from large collections of unstructured text documents has become a central task in current NLP applications, and algorithms like NMF and LDA, as well as their generalizations, are the well-established current state of the art. However, especially when it comes to short text documents like Tweets, these approaches often lead to unsatisfying results due to the sparsity of the document-feature matrices. Even though several approaches have been proposed to overcome this sparsity by taking additional information into account, these focus merely on the aggregation of similar documents and the estimation of word co-occurrences. This ultimately neglects the fact that a lot of topical information can actually be retrieved from so-called hashtag graphs by applying common community detection algorithms. Therefore, this paper outlines a novel approach to integrating topic structures of hashtag graphs into the estimation of topic models by connecting graph-based community detection and semi-supervised NMF. By applying this approach to recently streamed Twitter data, we show that this procedure leads to more intuitive and humanly interpretable topics.
