Johannes Deleu

Dynamic Negative Guidance of Diffusion Models

Oct 18, 2024

Felix Koulischer, Johannes Deleu, Gabriel Raya, Thomas Demeester, Luca Ambrogioni

Abstract:Negative Prompting (NP) is widely utilized in diffusion models, particularly in text-to-image applications, to prevent the generation of undesired features. In this paper, we show that conventional NP is limited by the assumption of a constant guidance scale, which may lead to highly suboptimal results, or even complete failure, due to the non-stationarity and state-dependence of the reverse process. Based on this analysis, we derive a principled technique called Dynamic Negative Guidance (DNG), which relies on a near-optimal time- and state-dependent modulation of the guidance without requiring additional training. Unlike NP, negative guidance requires estimating the posterior class probability during the denoising process, which is achieved with limited additional computational overhead by tracking the discrete Markov Chain during the generative process. We evaluate the performance of DNG for class removal on MNIST and CIFAR10, where we show that DNG leads to higher safety, preservation of class balance and image quality when compared with baseline methods. Furthermore, we show that it is possible to use DNG with Stable Diffusion to obtain more accurate and less invasive guidance than NP.

* Paper currently under review. Submitted to ICLR 2025

Clinical Reasoning over Tabular Data and Text with Bayesian Networks

Mar 19, 2024

Paloma Rabaey, Johannes Deleu, Stefan Heytens, Thomas Demeester

Abstract:Bayesian networks are well-suited for clinical reasoning on tabular data, but are less compatible with natural language data, for which neural networks provide a successful framework. This paper compares and discusses strategies to augment Bayesian networks with neural text representations, both in a generative and discriminative manner. This is illustrated with simulation results for a primary care use case (diagnosis of pneumonia) and discussed in a broader clinical context.

* 10 pages, 2 figures

Exploring the Temperature-Dependent Phase Transition in Modern Hopfield Networks

Nov 30, 2023

Felix Koulischer, Cédric Goemaere, Tom van der Meersch, Johannes Deleu, Thomas Demeester

Abstract:The recent discovery of a connection between Transformers and Modern Hopfield Networks (MHNs) has reignited the study of neural networks from a physical energy-based perspective. This paper focuses on the pivotal effect of the inverse temperature hyperparameter $\beta$ on the distribution of energy minima of the MHN. To achieve this, the distribution of energy minima is tracked in a simplified MHN in which equidistant normalised patterns are stored. This network demonstrates a phase transition at a critical temperature $\beta_{\text{c}}$, from a single global attractor towards highly pattern-specific minima as $\beta$ is increased. Importantly, the dynamics are not solely governed by the hyperparameter $\beta$ but are instead determined by an effective inverse temperature $\beta_{\text{eff}}$ which also depends on the distribution and size of the stored patterns. Recognizing the role of hyperparameters in the MHN could, in the future, aid researchers in the domain of Transformers to optimise their initial choices, potentially reducing the necessity for time- and energy-expensive hyperparameter fine-tuning.

* Accepted as poster for Associative Memory and Hopfield Networks workshop at NeurIPS23
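
The transition from a single global attractor to pattern-specific minima described in the abstract above can be illustrated with a minimal sketch. It assumes the commonly used Modern Hopfield update rule, xi <- X^T softmax(beta * X xi), and a toy set of four orthonormal patterns. The function name `mhn_retrieve`, the pattern set, the query, and the two values of beta are illustrative assumptions, not the authors' experimental setup.

```python
import numpy as np


def softmax(z: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D array."""
    z = z - np.max(z)
    e = np.exp(z)
    return e / e.sum()


def mhn_retrieve(patterns: np.ndarray, query: np.ndarray,
                 beta: float, steps: int = 50) -> np.ndarray:
    """Iterate the (assumed) MHN update xi <- X^T softmax(beta * X xi).

    patterns: array of shape (N, d), one stored pattern per row.
    query:    array of shape (d,), the initial state.
    """
    xi = query.astype(float)
    for _ in range(steps):
        xi = patterns.T @ softmax(beta * (patterns @ xi))
    return xi


# Toy setup (assumption): four orthonormal, hence equidistant, patterns.
patterns = np.eye(4)
query = np.array([0.9, 0.1, 0.0, 0.0])

low = mhn_retrieve(patterns, query, beta=0.1)    # small beta
high = mhn_retrieve(patterns, query, beta=50.0)  # large beta

print("beta=0.1 :", np.round(low, 3))
print("beta=50  :", np.round(high, 3))
```

In this toy case the small-beta run settles on the uniform mixture of all stored patterns, while the large-beta run settles on the stored pattern closest to the query.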

Accelerating Hierarchical Associative Memory: A Deep Equilibrium Approach

Nov 27, 2023

Cédric Goemaere, Johannes Deleu, Thomas Demeester

Abstract:Hierarchical Associative Memory models have recently been proposed as a versatile extension of continuous Hopfield networks. In order to facilitate future research on such models, especially at scale, we focus on increasing their simulation efficiency on digital hardware. In particular, we propose two strategies to speed up memory retrieval in these models, which corresponds to their use at inference, but is equally important during training. First, we show how they can be cast as Deep Equilibrium Models, which allows using faster and more stable solvers. Second, inspired by earlier work, we show that alternating optimization of the even and odd layers accelerates memory retrieval by a factor close to two. Combined, these two techniques allow for a much faster energy minimization, as shown in our proof-of-concept experimental results. The code is available at https://github.com/cgoemaere/hamdeq

* Accepted at the "Associative Memory & Hopfield Networks" workshop at NeurIPS, 2023

Training a Hopfield Variational Autoencoder with Equilibrium Propagation

Nov 25, 2023

Tom Van Der Meersch, Johannes Deleu, Thomas Demeester

Abstract:On dedicated analog hardware, equilibrium propagation is an energy-efficient alternative to backpropagation. In spite of its theoretical guarantees, its application in the AI domain remains limited to the discriminative setting. Meanwhile, despite its high computational demands, generative AI is on the rise. In this paper, we demonstrate the application of Equilibrium Propagation in training a variational autoencoder (VAE) for generative modeling. Leveraging the symmetric nature of Hopfield networks, we propose using a single model to serve as both the encoder and decoder which could effectively halve the required chip size for VAE implementations, paving the way for more efficient analog hardware configurations.

* Associative Memory & Hopfield Networks in 2023 (NeurIPS 2023 workshop)

Career Path Prediction using Resume Representation Learning and Skill-based Matching

Oct 24, 2023

Jens-Joris Decorte, Jeroen Van Hautte, Johannes Deleu, Chris Develder, Thomas Demeester

Abstract:The impact of person-job fit on job satisfaction and performance is widely acknowledged, which highlights the importance of providing workers with next steps at the right time in their career. This task of predicting the next step in a career is known as career path prediction, and has diverse applications such as turnover prevention and internal job mobility. Existing methods for career path prediction rely on large amounts of private career history data to model the interactions between job titles and companies. We propose leveraging the unexplored textual descriptions that are part of work experience sections in resumes. We introduce a structured dataset of 2,164 anonymized career histories, annotated with ESCO occupation labels. Based on this dataset, we present a novel representation learning approach, CareerBERT, specifically designed for work history data. We develop a skill-based model and a text-based model for career path prediction, which achieve 35.24% and 39.61% recall@10 respectively on our dataset. Finally, we show that both approaches are complementary, as a hybrid approach achieves the strongest result with 43.01% recall@10.

* Accepted to the 3rd Workshop on Recommender Systems for Human Resources (RecSys in HR 2023) as part of RecSys 2023
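
The recall@10 figures quoted in the abstract above can be read through a small sketch of the metric. It assumes each career history has exactly one ground-truth next occupation and that the model returns a ranked list of candidate labels. The function name `recall_at_k` and the occupation labels are hypothetical, chosen only for illustration; this is not the authors' evaluation code.

```python
from typing import Sequence


def recall_at_k(ranked_predictions: Sequence[Sequence[str]],
                true_labels: Sequence[str],
                k: int = 10) -> float:
    """Fraction of examples whose true label is among the top-k predictions.

    Assumption: one ground-truth label per example.
    """
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        return 0.0
    hits = sum(
        true in list(preds)[:k]
        for preds, true in zip(ranked_predictions, true_labels)
    )
    return hits / len(true_labels)


# Hypothetical ranked predictions for three career histories.
predictions = [
    ["nurse", "teacher", "chef"],
    ["chef", "driver", "nurse"],
    ["teacher", "chef", "driver"],
]
truth = ["teacher", "nurse", "pilot"]

print(recall_at_k(predictions, truth, k=2))
print(recall_at_k(predictions, truth, k=3))
```

With k=2 only the first history counts as a hit; raising k to 3 also recovers the second, while the third is never recovered because its true label is absent from the predictions.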

Distractor generation for multiple-choice questions with predictive prompting and large language models

Jul 30, 2023

Semere Kiros Bitew, Johannes Deleu, Chris Develder, Thomas Demeester

Abstract:Large Language Models (LLMs) such as ChatGPT have demonstrated remarkable performance across various tasks and have garnered significant attention from both researchers and practitioners. However, in an educational context, we still observe a performance gap in generating distractors -- i.e., plausible yet incorrect answers -- with LLMs for multiple-choice questions (MCQs). In this study, we propose a strategy for guiding LLMs, such as ChatGPT, in generating relevant distractors by prompting them with question items automatically retrieved from a question bank as well-chosen in-context examples. We evaluate our LLM-based solutions using a quantitative assessment on an existing test set, as well as through quality annotations by human experts, i.e., teachers. We found that on average 53% of the generated distractors presented to the teachers were rated as high-quality, i.e., suitable for immediate use as is, outperforming the state-of-the-art model. We also show the gains of our approach in generating high-quality distractors by comparing it with a zero-shot ChatGPT and a few-shot ChatGPT prompted with static examples.

* 16 pages, Accepted at the 1st International Tutorial and Workshop on Responsible Knowledge Discovery in Education

Extreme Multi-Label Skill Extraction Training using Large Language Models

Jul 20, 2023

Jens-Joris Decorte, Severine Verlinden, Jeroen Van Hautte, Johannes Deleu, Chris Develder, Thomas Demeester

Abstract:Online job ads serve as a valuable source of information for skill requirements, playing a crucial role in labor market analysis and e-recruitment processes. Since such ads are typically formatted in free text, natural language processing (NLP) technologies are required to automatically process them. We specifically focus on the task of detecting skills (mentioned literally, or implicitly described) and linking them to a large skill ontology, making it a challenging case of extreme multi-label classification (XMLC). Given that no sizable labeled (training) dataset is available for this specific XMLC task, we propose techniques to leverage general Large Language Models (LLMs). We describe a cost-effective approach to generate an accurate, fully synthetic labeled dataset for skill extraction, and present a contrastive learning strategy that proves effective in the task. Our results across three skill extraction benchmarks show a consistent increase of between 15 and 25 percentage points in R-Precision@5 compared to previously published results that relied solely on distant supervision through literal matches.

* Accepted to the International workshop on AI for Human Resources and Public Employment Services (AI4HR&PES) as part of ECML-PKDD 2023
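
As a rough illustration of the R-Precision@5 metric named in the abstract above, the sketch below uses one common definition from the extreme multi-label classification literature: the number of gold labels found in the top K predictions, divided by min(K, number of gold labels), averaged over documents. Whether the paper uses exactly this variant is an assumption. The function name `r_precision_at_k` and the skill labels are hypothetical.

```python
from typing import Sequence, Set


def r_precision_at_k(ranked_predictions: Sequence[Sequence[str]],
                     gold_labels: Sequence[Set[str]],
                     k: int = 5) -> float:
    """Mean over documents of hits@k / min(k, number of gold labels).

    Assumptions: each ranked list contains no duplicate labels, and
    documents without any gold label are skipped.
    """
    scores = []
    for preds, gold in zip(ranked_predictions, gold_labels):
        if not gold:
            continue
        hits = sum(1 for label in list(preds)[:k] if label in gold)
        scores.append(hits / min(k, len(gold)))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical ranked skill predictions for two job-ad sentences.
ranked = [
    ["python", "sql", "java", "excel", "git"],
    ["teamwork", "sales", "python", "sql", "java"],
]
gold = [
    {"python", "git", "docker"},
    {"sales"},
]

print(round(r_precision_at_k(ranked, gold, k=5), 4))
```

Dividing by min(K, number of gold labels) keeps a perfect score reachable for documents that have fewer than K gold labels, which is why this variant is often preferred over plain precision@K when label counts per document are small.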

Learning from Partially Annotated Data: Example-aware Creation of Gap-filling Exercises for Language Learning

Jun 15, 2023

Semere Kiros Bitew, Johannes Deleu, A. Seza Doğruöz, Chris Develder, Thomas Demeester

Abstract:Since performing exercises (including, e.g., practice tests) forms a crucial component of learning, and creating such exercises requires non-trivial effort from the teacher, there is great value in automatic exercise generation in digital tools in education. In this paper, we particularly focus on automatic creation of gap-filling exercises for language learning, specifically grammar exercises. Since providing any annotation in this domain requires human expert effort, we aim to avoid it entirely and explore the task of converting existing texts into new gap-filling exercises, purely based on an example exercise, without explicit instruction or detailed annotation of the intended grammar topics. We contribute (i) a novel neural network architecture specifically designed for the aforementioned gap-filling exercise generation task, and (ii) a real-world benchmark dataset for French grammar. We show that our model for this French grammar gap-filling exercise generation outperforms a competitive baseline classifier by 8 F1 percentage points, achieving an average F1 score of 82%. Our model implementation and the dataset are made publicly available to foster future research, thus offering a standardized evaluation and baseline solution of the proposed partially annotated data prediction task in grammar exercise creation.

* 12 pages, Accepted in the 18th Workshop on Innovative Use of NLP for Building Educational Applications

BioDEX: Large-Scale Biomedical Adverse Drug Event Extraction for Real-World Pharmacovigilance

May 22, 2023

Karel D'Oosterlinck, François Remy, Johannes Deleu, Thomas Demeester, Chris Develder, Klim Zaporojets, Aneiss Ghodsi, Simon Ellershaw, Jack Collins, Christopher Potts

Abstract:Timely and accurate extraction of Adverse Drug Events (ADE) from biomedical literature is paramount for public safety, but involves slow and costly manual labor. We set out to improve drug safety monitoring (pharmacovigilance, PV) through the use of Natural Language Processing (NLP). We introduce BioDEX, a large-scale resource for Biomedical adverse Drug Event Extraction, rooted in the historical output of drug safety reporting in the U.S. BioDEX consists of 65k abstracts and 19k full-text biomedical papers with 256k associated document-level safety reports created by medical experts. The core features of these reports include the reported weight, age, and biological sex of a patient, a set of drugs taken by the patient, the drug dosages, the reactions experienced, and whether the reaction was life threatening. In this work, we consider the task of predicting the core information of the report given its originating paper. We estimate human performance to be 72.0% F1, whereas our best model achieves 62.3% F1, indicating significant headroom on this task. We also begin to explore ways in which these models could help professional PV reviewers. Our code and data are available: https://github.com/KarelDO/BioDEX.

* 28 pages
