Sven Weinzierl

CareerBERT: Matching Resumes to ESCO Jobs in a Shared Embedding Space for Generic Job Recommendations

Mar 03, 2025

Julian Rosenberger, Lukas Wolfrum, Sven Weinzierl, Mathias Kraus, Patrick Zschech

Abstract: The rapidly evolving labor market, driven by technological advancements and economic shifts, presents significant challenges for traditional job matching and consultation services. In response, we introduce an advanced support tool for career counselors and job seekers based on CareerBERT, a novel approach that leverages the power of unstructured textual data sources, such as resumes, to provide more accurate and comprehensive job recommendations. In contrast to previous approaches that primarily focus on job recommendations based on a fixed set of concrete job advertisements, our approach involves the creation of a corpus that combines data from the European Skills, Competences, and Occupations (ESCO) taxonomy and EURopean Employment Services (EURES) job advertisements, ensuring an up-to-date and well-defined representation of general job titles in the labor market. Our two-step evaluation approach, consisting of an application-grounded evaluation using EURES job advertisements and a human-grounded evaluation using real-world resumes and Human Resources (HR) expert feedback, provides a comprehensive assessment of CareerBERT's performance. Our experimental results demonstrate that CareerBERT outperforms both traditional and state-of-the-art embedding approaches while showing robust effectiveness in human expert evaluations. These results confirm the effectiveness of CareerBERT in supporting career consultants by generating relevant job recommendations based on resumes, ultimately enhancing the efficiency of job consultations and expanding the perspectives of job seekers. This research contributes to the field of NLP and job recommendation systems, offering valuable insights for both researchers and practitioners in the domain of career consulting and job matching.

* Accepted at Expert Systems with Applications. In Press, see https://doi.org/10.1016/j.eswa.2025.127043
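
The matching step described in the CareerBERT abstract (resumes and job titles placed in a shared embedding space) can be illustrated with a minimal sketch. The vectors below are made-up toy values, and the function names `cosine_similarity` and `rank_jobs` are hypothetical; the abstract does not state which similarity measure or ranking procedure the authors use, so cosine similarity is an assumption here.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)

def rank_jobs(resume_vec, job_vecs, top_k=3):
    """Return the top_k (job title, score) pairs, highest similarity first."""
    scored = [(title, cosine_similarity(resume_vec, vec))
              for title, vec in job_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]

# Toy embeddings (assumed values, not taken from the paper).
job_vecs = {
    "software developer": [0.9, 0.1, 0.0],
    "data analyst": [0.6, 0.7, 0.1],
    "nurse": [0.0, 0.1, 0.9],
}
resume_vec = [0.8, 0.3, 0.0]

ranking = rank_jobs(resume_vec, job_vecs, top_k=2)
for title, score in ranking:
    print(f"{title}: {score:.3f}")
```

In a real system the vectors would come from a trained sentence encoder rather than hand-written lists; only the ranking logic is shown.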

(Neural-Symbolic) Machine Learning for Inconsistency Measurement

Feb 05, 2025

Sven Weinzierl, Carl Cora

Abstract: We present machine-learning-based approaches for determining the degree of inconsistency, which is a numerical value, for propositional logic knowledge bases. Specifically, we present regression- and neural-based models that learn to predict the values that the inconsistency measures $\incmi$ and $\incat$ would assign to propositional logic knowledge bases. Our main motivation is that computing these values conventionally can be hard complexity-wise. As an important addition, we use specific postulates, that is, properties, of the underlying inconsistency measures to infer symbolic rules, which we combine with the learning-based models in the form of constraints. We perform various experiments and show that a) predicting the degree values is feasible in many situations, and b) including the symbolic constraints deduced from the rationality postulates increases the prediction quality.

Challenging the Performance-Interpretability Trade-off: An Evaluation of Interpretable Machine Learning Models

Sep 22, 2024

Sven Kruschel, Nico Hambauer, Sven Weinzierl, Sandra Zilker, Mathias Kraus, Patrick Zschech

Abstract: Machine learning is permeating every conceivable domain to promote data-driven decision support. The focus is often on advanced black-box models due to their assumed performance advantages, whereas interpretable models are often associated with inferior predictive qualities. More recently, however, a new generation of generalized additive models (GAMs) has been proposed that offer promising properties for capturing complex, non-linear patterns while remaining fully interpretable. To uncover the merits and limitations of these models, this study examines the predictive performance of seven different GAMs in comparison to seven commonly used machine learning models based on a collection of twenty tabular benchmark datasets. To ensure a fair and robust model comparison, an extensive hyperparameter search combined with cross-validation was performed, resulting in 68,500 model runs. In addition, this study qualitatively examines the visual output of the models to assess their level of interpretability. Based on these results, the paper dispels the misconception that only black-box models can achieve high accuracy by demonstrating that there is no strict trade-off between predictive performance and model interpretability for tabular data. Furthermore, the paper discusses the importance of GAMs as powerful interpretable models for the field of information systems and derives implications for future work from a socio-technical perspective.

* Accepted for publication in Business & Information Systems Engineering (2024)

Documentation Practices of Artificial Intelligence

Jun 26, 2024

Stefan Arnold, Dilara Yesilbas, Rene Gröbner, Dominik Riedelbauch, Maik Horn, Sven Weinzierl

Abstract: Artificial Intelligence (AI) faces persistent challenges in terms of transparency and accountability, which requires rigorous documentation. Through a literature review on documentation practices, we provide an overview of prevailing trends, persistent issues, and the multifaceted interplay of factors influencing the documentation. Our examination of key characteristics such as scope, target audiences, support for multimodality, and level of automation, highlights a dynamic evolution in documentation practices, underscored by a shift towards a more holistic, engaging, and automated documentation.

Recent Advances in Data-Driven Business Process Management

Jun 03, 2024

Lars Ackermann, Martin Käppel, Laura Marcus, Linda Moder, Sebastian Dunzer, Markus Hornsteiner, Annina Liessmann, Yorck Zisgen, Philip Empl, Lukas-Valentin Herm (+14 more)

Abstract: The rapid development of cutting-edge technologies, the increasing volume of data, and the availability and processability of new types of data sources have led to a paradigm shift in data-based management and decision-making. Since business processes are at the core of organizational work, these developments heavily impact BPM as a crucial success factor for organizations. In view of this emerging potential, data-driven business process management has become a relevant and vibrant research area. Given the complexity and interdisciplinarity of the research field, this position paper presents research insights regarding data-driven BPM.

* position paper, 34 pages, 10 figures

Machine learning in business process management: A systematic literature review

May 26, 2024

Sven Weinzierl, Sandra Zilker, Sebastian Dunzer, Martin Matzner

Abstract: Machine learning (ML) provides algorithms to create computer programs based on data without explicitly programming them. In business process management (BPM), ML applications are used to analyse and improve processes efficiently. Three frequent examples of using ML are providing decision support through predictions, discovering accurate process models, and improving resource allocation. This paper organises the body of knowledge on ML in BPM. We extract BPM tasks from different literature streams, summarise them under the phases of a process's lifecycle, explain how ML helps perform these tasks and identify technical commonalities in ML implementations across tasks. This study is the first exhaustive review of how ML has been used in BPM. We hope that it can open the door for a new era of cumulative research by helping researchers to identify relevant preliminary work and then combine and further develop existing approaches in a focused fashion. Our paper helps managers and consultants to find ML applications that are relevant in the current project phase of a BPM initiative, like redesigning a business process. We also offer, as a synthesis of our review, a research agenda that spreads ten avenues for future research, including applying novel ML concepts like federated learning, addressing less regarded BPM lifecycle phases like process identification, and delivering ML applications with a focus on end-users.

Driving Context into Text-to-Text Privatization

Jun 02, 2023

Stefan Arnold, Dilara Yesilbas, Sven Weinzierl

Abstract: Metric Differential Privacy enables text-to-text privatization by adding calibrated noise to the vector of a word derived from an embedding space and projecting this noisy vector back to a discrete vocabulary using a nearest neighbor search. Since words are substituted without context, this mechanism is expected to fall short at finding substitutes for words with ambiguous meanings, such as 'bank'. To account for these ambiguous words, we leverage a sense embedding and incorporate a sense disambiguation step prior to noise injection. We encompass our modification to the privatization mechanism with an estimation of privacy and utility. For word sense disambiguation on the Words in Context dataset, we demonstrate a substantial increase in classification accuracy by 6.05%.
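
The baseline mechanism this abstract describes (add calibrated noise to a word vector, then project back to the vocabulary by nearest neighbor search) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the noise distribution (density proportional to exp(-epsilon * ||z||), sampled as a uniform direction scaled by a Gamma-distributed magnitude) is a common choice for metric differential privacy but is not specified in the abstract, the embeddings are toy values, and the names `sample_noise` and `privatize` are hypothetical. The sense disambiguation step the paper adds is not shown.

```python
import math
import random

def sample_noise(dim, epsilon, rng):
    """Sample noise with density proportional to exp(-epsilon * ||z||)."""
    # Uniform direction: normalise a standard Gaussian vector.
    direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in direction))
    # Magnitude: Gamma with shape=dim and scale=1/epsilon.
    magnitude = rng.gammavariate(dim, 1.0 / epsilon)
    return [magnitude * x / norm for x in direction]

def privatize(word, embeddings, epsilon, rng):
    """Perturb the word's vector and return the nearest vocabulary word."""
    vector = embeddings[word]
    noise = sample_noise(len(vector), epsilon, rng)
    noisy = [v + n for v, n in zip(vector, noise)]
    return min(embeddings, key=lambda w: math.dist(embeddings[w], noisy))

# Toy 2-d embeddings (assumed values for illustration only).
toy_embeddings = {
    "bank": [0.0, 0.0],
    "river": [1.0, 0.0],
    "money": [0.0, 1.0],
}
rng = random.Random(0)
substitute = privatize("bank", toy_embeddings, epsilon=5.0, rng=rng)
print(substitute)
```

A smaller `epsilon` yields larger noise and therefore more frequent substitutions; a very large `epsilon` leaves the word unchanged almost surely.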

Guiding Text-to-Text Privatization by Syntax

Jun 02, 2023

Stefan Arnold, Dilara Yesilbas, Sven Weinzierl

Abstract: Metric Differential Privacy is a generalization of differential privacy tailored to address the unique challenges of text-to-text privatization. By adding noise to the representation of words in the geometric space of embeddings, words are replaced with words located in the proximity of the noisy representation. Since embeddings are trained based on word co-occurrences, this mechanism ensures that substitutions stem from a common semantic context. Without considering the grammatical category of words, however, this mechanism cannot guarantee that substitutions play similar syntactic roles. We analyze the capability of text-to-text privatization to preserve the grammatical category of words after substitution and find that surrogate texts consist almost exclusively of nouns. Lacking the capability to produce surrogate texts that correlate with the structure of the sensitive texts, we encompass our analysis by transforming the privatization step into a candidate selection problem in which substitutions are directed to words with matching grammatical properties. We demonstrate a substantial improvement in the performance of downstream tasks by up to 4.66% while retaining comparative privacy guarantees.
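
The candidate selection idea in this abstract (restrict substitutions to words with matching grammatical properties) might look roughly like the sketch below. It assumes the noisy vector has already been computed and that a part-of-speech tag is available for every vocabulary word; the toy embeddings, the tag dictionary, and the function name `nearest_with_matching_tag` are all hypothetical, and the paper's actual selection procedure may differ.

```python
import math

def nearest_with_matching_tag(noisy_vec, source_tag, embeddings, pos_tags):
    """Nearest neighbor search limited to words sharing the source word's tag."""
    candidates = [w for w in embeddings if pos_tags.get(w) == source_tag]
    if not candidates:
        # Fall back to the full vocabulary if no word carries the tag.
        candidates = list(embeddings)
    return min(candidates, key=lambda w: math.dist(embeddings[w], noisy_vec))

# Toy data (assumed values for illustration only).
embeddings = {
    "run": [0.0, 0.0],
    "river": [0.5, 0.0],
    "walk": [0.9, 0.0],
}
pos_tags = {"run": "VERB", "river": "NOUN", "walk": "VERB"}

noisy_vec = [0.6, 0.0]  # stands in for the perturbed vector of "run"
unconstrained = min(embeddings, key=lambda w: math.dist(embeddings[w], noisy_vec))
constrained = nearest_with_matching_tag(noisy_vec, "VERB", embeddings, pos_tags)
print(unconstrained, constrained)
```

In this toy case the unconstrained search picks the noun "river", while the constrained search picks the verb "walk", which keeps the grammatical category of the original word.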

GAM changer or not? An evaluation of interpretable machine learning models based on additive model constraints

Apr 19, 2022

Patrick Zschech, Sven Weinzierl, Nico Hambauer, Sandra Zilker, Mathias Kraus

Abstract: The number of information systems (IS) studies dealing with explainable artificial intelligence (XAI) is currently exploding as the field demands more transparency about the internal decision logic of machine learning (ML) models. However, most techniques subsumed under XAI provide post-hoc-analytical explanations, which have to be considered with caution as they only use approximations of the underlying ML model. Therefore, our paper investigates a series of intrinsically interpretable ML models and discusses their suitability for the IS community. More specifically, our focus is on advanced extensions of generalized additive models (GAM) in which predictors are modeled independently in a non-linear way to generate shape functions that can capture arbitrary patterns but remain fully interpretable. In our study, we evaluate the prediction qualities of five GAMs as compared to six traditional ML models and assess their visual outputs for model interpretability. On this basis, we investigate their merits and limitations and derive design implications for further improvements.

* Preprint accepted for archival and presentation at the 30th European Conference on Information Systems (ECIS 2022)

Time Matters: Time-Aware LSTMs for Predictive Business Process Monitoring

Oct 16, 2020

An Nguyen, Srijeet Chatterjee, Sven Weinzierl, Leo Schwinn, Martin Matzner, Bjoern Eskofier

Abstract: Predictive business process monitoring (PBPM) aims to predict future process behavior during ongoing process executions based on event log data. In particular, techniques for next-activity and timestamp prediction can help to improve the performance of operational business processes. Recently, many PBPM solutions based on deep learning were proposed by researchers. Due to the sequential nature of event log data, a common choice is to apply recurrent neural networks with long short-term memory (LSTM) cells. We argue that the elapsed time between events is informative. However, current PBPM techniques mainly use 'vanilla' LSTM cells and hand-crafted time-related control flow features. To better model the time dependencies between events, we propose a new PBPM technique based on time-aware LSTM (T-LSTM) cells. T-LSTM cells incorporate the elapsed time between consecutive events inherently to adjust the cell memory. Furthermore, we introduce cost-sensitive learning to account for the common class imbalance in event logs. Our experiments on publicly available benchmark event logs indicate the effectiveness of the introduced techniques.

* 12 pages, 4 figures, to be published in post-workshop proceedings volume in the series Lecture Notes in Business Information Processing (LNBIP) - 1st International Workshop on Leveraging Machine Learning in Process Mining (ML4PM) @ ICPM 2020
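
The cost-sensitive learning mentioned in this abstract can be illustrated with one common scheme: weighting each class inversely to its frequency and scaling the per-sample loss by that weight. The abstract does not say which weighting the authors use, so the scheme, the toy labels, and the function names `inverse_frequency_weights` and `weighted_cross_entropy` are assumptions for illustration.

```python
import math
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}

def weighted_cross_entropy(probs, label, weights):
    """Cross-entropy for one sample, scaled by the weight of its true class."""
    return -weights[label] * math.log(probs[label])

# Toy next-activity labels with a strong imbalance (assumed data).
labels = ["A"] * 8 + ["B"] * 2
weights = inverse_frequency_weights(labels)
print(weights)

# The same predicted probability costs more when the rare class is the target.
loss_common = weighted_cross_entropy({"A": 0.5, "B": 0.5}, "A", weights)
loss_rare = weighted_cross_entropy({"A": 0.5, "B": 0.5}, "B", weights)
print(loss_common, loss_rare)
```

With these weights, errors on the rare activity "B" contribute four times as much to the loss as errors on the frequent activity "A".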
