Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Kostadin Cvejoski

Foundation Inference Models for Stochastic Differential Equations: A Transformer-based Approach for Zero-shot Function Estimation

Feb 26, 2025

Patrick Seifner, Kostadin Cvejoski, David Berghaus, Cesar Ojeda, Ramses J. Sanchez

Abstract:Stochastic differential equations (SDEs) describe dynamical systems where deterministic flows, governed by a drift function, are superimposed with random fluctuations dictated by a diffusion function. The accurate estimation (or discovery) of these functions from data is a central problem in machine learning, with wide application across natural and social sciences alike. Yet current solutions are brittle, and typically rely on symbolic regression or Bayesian non-parametrics. In this work, we introduce FIM-SDE (Foundation Inference Model for SDEs), a transformer-based recognition model capable of performing accurate zero-shot estimation of the drift and diffusion functions of SDEs, from noisy and sparse observations on empirical processes of different dimensionalities. Leveraging concepts from amortized inference and neural operators, we train FIM-SDE in a supervised fashion, to map a large set of noisy and discretely observed SDE paths to their corresponding drift and diffusion functions. We demonstrate that one and the same (pretrained) FIM-SDE achieves robust zero-shot function estimation (i.e. without any parameter fine-tuning) across a wide range of synthetic and real-world processes, from canonical SDE systems (e.g. double-well dynamics or weakly perturbed Hopf bifurcations) to human motion recordings and oil price and wind speed fluctuations.

Via

Access Paper or Ask Questions

Foundation Inference Models for Markov Jump Processes

Jun 10, 2024

David Berghaus, Kostadin Cvejoski, Patrick Seifner, Cesar Ojeda, Ramses J. Sanchez

Figure 1 for Foundation Inference Models for Markov Jump Processes

Figure 2 for Foundation Inference Models for Markov Jump Processes

Figure 3 for Foundation Inference Models for Markov Jump Processes

Figure 4 for Foundation Inference Models for Markov Jump Processes

Abstract:Markov jump processes are continuous-time stochastic processes which describe dynamical systems evolving in discrete state spaces. These processes find wide application in the natural sciences and machine learning, but their inference is known to be far from trivial. In this work we introduce a methodology for zero-shot inference of Markov jump processes (MJPs), on bounded state spaces, from noisy and sparse observations, which consists of two components. First, a broad probability distribution over families of MJPs, as well as over possible observation times and noise mechanisms, with which we simulate a synthetic dataset of hidden MJPs and their noisy observation process. Second, a neural network model that processes subsets of the simulated observations, and that is trained to output the initial condition and rate matrix of the target MJP in a supervised way. We empirically demonstrate that one and the same (pretrained) model can infer, in a zero-shot fashion, hidden MJPs evolving in state spaces of different dimensionalities. Specifically, we infer MJPs which describe (i) discrete flashing ratchet systems, which are a type of Brownian motors, and the conformational dynamics in (ii) molecular simulations, (iii) experimental ion channel data and (iv) simple protein folding models. What is more, we show that our model performs on par with state-of-the-art models which are finetuned to the target datasets.

Via

Access Paper or Ask Questions

Foundational Inference Models for Dynamical Systems

Feb 12, 2024

Patrick Seifner, Kostadin Cvejoski, Ramses J. Sanchez

Figure 1 for Foundational Inference Models for Dynamical Systems

Figure 2 for Foundational Inference Models for Dynamical Systems

Figure 3 for Foundational Inference Models for Dynamical Systems

Figure 4 for Foundational Inference Models for Dynamical Systems

Abstract:Ordinary differential equations (ODEs) underlie dynamical systems which serve as models for a vast number of natural and social phenomena. Yet inferring the ODE that best describes a set of noisy observations on one such phenomenon can be remarkably challenging, and the models available to achieve it tend to be highly specialized and complex too. In this work we propose a novel supervised learning framework for zero-shot inference of ODEs from noisy data. We first generate large datasets of one-dimensional ODEs, by sampling distributions over the space of initial conditions, and the space of vector fields defining them. We then learn neural maps between noisy observations on the solutions of these equations, and their corresponding initial condition and vector fields. The resulting models, which we call foundational inference models (FIM), can be (i) copied and matched along the time dimension to increase their resolution; and (ii) copied and composed to build inference models of any dimensionality, without the need of any finetuning. We use FIM to model both ground-truth dynamical systems of different dimensionalities and empirical time series data in a zero-shot fashion, and outperform state-of-the-art models which are finetuned to these systems. Our (pretrained) FIMs are available online

Via

Access Paper or Ask Questions

Neural Dynamic Focused Topic Model

Jan 26, 2023

Kostadin Cvejoski, Ramsés J. Sánchez, César Ojeda

Abstract:Topic models and all their variants analyse text by learning meaningful representations through word co-occurrences. As pointed out by Williamson et al. (2010), such models implicitly assume that the probability of a topic to be active and its proportion within each document are positively correlated. This correlation can be strongly detrimental in the case of documents created over time, simply because recent documents are likely better described by new and hence rare topics. In this work we leverage recent advances in neural variational inference and present an alternative neural approach to the dynamic Focused Topic Model. Indeed, we develop a neural model for topic evolution which exploits sequences of Bernoulli random variables in order to track the appearances of topics, thereby decoupling their activities from their proportions. We evaluate our model on three different datasets (the UN general debates, the collection of NeurIPS papers, and the ACL Anthology dataset) and show that it (i) outperforms state-of-the-art topic models in generalization tasks and (ii) performs comparably to them on prediction tasks, while employing roughly the same number of parameters, and converging about two times faster. Source code to reproduce our experiments is available online.

* Accepted at Association for the Advancement of Artificial Intelligence (AAAI2023)

Via

Access Paper or Ask Questions

The future is different: Large pre-trained language models fail in prediction tasks

Nov 02, 2022

Kostadin Cvejoski, Ramsés J. Sánchez, César Ojeda

Abstract:Large pre-trained language models (LPLM) have shown spectacular success when fine-tuned on downstream supervised tasks. Yet, it is known that their performance can drastically drop when there is a distribution shift between the data used during training and that used at inference time. In this paper we focus on data distributions that naturally change over time and introduce four new REDDIT datasets, namely the WALLSTREETBETS, ASKSCIENCE, THE DONALD, and POLITICS sub-reddits. First, we empirically demonstrate that LPLM can display average performance drops of about 88% (in the best case!) when predicting the popularity of future posts from sub-reddits whose topic distribution changes with time. We then introduce a simple methodology that leverages neural variational dynamic topic models and attention mechanisms to infer temporal language model representations for regression tasks. Our models display performance drops of only about 40% in the worst cases (2% in the best ones) when predicting the popularity of future posts, while using only about 7% of the total number of parameters of LPLM and providing interpretable representations that offer insight into real-world events, like the GameStop short squeeze of 2021

Via

Access Paper or Ask Questions

Hidden Schema Networks

Jul 08, 2022

Ramsés J. Sánchez, Lukas Conrads, Pascal Welke, Kostadin Cvejoski, César Ojeda

Abstract:Most modern language models infer representations that, albeit powerful, lack both compositionality and semantic interpretability. Starting from the assumption that a large proportion of semantic content is necessarily relational, we introduce a neural language model that discovers networks of symbols (schemata) from text datasets. Using a variational autoencoder (VAE) framework, our model encodes sentences into sequences of symbols (composed representation), which correspond to the nodes visited by biased random walkers on a global latent graph. Sentences are then generated back, conditioned on the selected symbol sequences. We first demonstrate that the model is able to uncover ground-truth graphs from artificially generated datasets of random token sequences. Next we leverage pretrained BERT and GPT-2 language models as encoder and decoder, respectively, to train our model on language modelling tasks. Qualitatively, our results show that the model is able to infer schema networks encoding different aspects of natural language. Quantitatively, the model achieves state-of-the-art scores on VAE language modeling benchmarks. Source code to reproduce our experiments is available at https://github.com/ramsesjsf/HiddenSchemaNetworks

Via

Access Paper or Ask Questions

Informed Pre-Training on Prior Knowledge

May 23, 2022

Laura von Rueden, Sebastian Houben, Kostadin Cvejoski, Christian Bauckhage, Nico Piatkowski

Figure 1 for Informed Pre-Training on Prior Knowledge

Figure 2 for Informed Pre-Training on Prior Knowledge

Figure 3 for Informed Pre-Training on Prior Knowledge

Figure 4 for Informed Pre-Training on Prior Knowledge

Abstract:When training data is scarce, the incorporation of additional prior knowledge can assist the learning process. While it is common to initialize neural networks with weights that have been pre-trained on other large data sets, pre-training on more concise forms of knowledge has rather been overlooked. In this paper, we propose a novel informed machine learning approach and suggest to pre-train on prior knowledge. Formal knowledge representations, e.g. graphs or equations, are first transformed into a small and condensed data set of knowledge prototypes. We show that informed pre-training on such knowledge prototypes (i) speeds up the learning processes, (ii) improves generalization capabilities in the regime where not enough training data is available, and (iii) increases model robustness. Analyzing which parts of the model are affected most by the prototypes reveals that improvements come from deeper layers that typically represent high-level features. This confirms that informed pre-training can indeed transfer semantic knowledge. This is a novel effect, which shows that knowledge-based pre-training has additional and complementary strengths to existing approaches.

Via

Access Paper or Ask Questions

Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey

May 10, 2022

Julian Wörmann, Daniel Bogdoll, Etienne Bührle, Han Chen, Evaristus Fuh Chuo, Kostadin Cvejoski, Ludger van Elst, Tobias Gleißner, Philip Gottschall, Stefan Griesche(+36 more)

Figure 1 for Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey

Figure 2 for Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey

Figure 3 for Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey

Figure 4 for Knowledge Augmented Machine Learning with Applications in Autonomous Driving: A Survey

Abstract:The existence of representative datasets is a prerequisite of many successful artificial intelligence and machine learning models. However, the subsequent application of these models often involves scenarios that are inadequately represented in the data used for training. The reasons for this are manifold and range from time and cost constraints to ethical considerations. As a consequence, the reliable use of these models, especially in safety-critical applications, is a huge challenge. Leveraging additional, already existing sources of knowledge is key to overcome the limitations of purely data-driven approaches, and eventually to increase the generalization capability of these models. Furthermore, predictions that conform with knowledge are crucial for making trustworthy and safe decisions even in underrepresented scenarios. This work provides an overview of existing techniques and methods in the literature that combine data-based models with existing knowledge. The identified approaches are structured according to the categories integration, extraction and conformity. Special attention is given to applications in the field of autonomous driving.

* 93 pages

Via

Access Paper or Ask Questions

Dynamic Review-based Recommenders

Oct 27, 2021

Kostadin Cvejoski, Ramses J. Sanchez, Christian Bauckhage, Cesar Ojeda

Figure 1 for Dynamic Review-based Recommenders

Figure 2 for Dynamic Review-based Recommenders

Figure 3 for Dynamic Review-based Recommenders

Abstract:Just as user preferences change with time, item reviews also reflect those same preference changes. In a nutshell, if one is to sequentially incorporate review content knowledge into recommender systems, one is naturally led to dynamical models of text. In the present work we leverage the known power of reviews to enhance rating predictions in a way that (i) respects the causality of review generation and (ii) includes, in a bidirectional fashion, the ability of ratings to inform language review models and vice-versa, language representations that help predict ratings end-to-end. Moreover, our representations are time-interval aware and thus yield a continuous-time representation of the dynamics. We provide experiments on real-world datasets and show that our methodology is able to outperform several state-of-the-art models. Source code for all models can be found at [1].

* 6pages, Published at International Data Science Conference 2021 (iDSC21)

Via

Access Paper or Ask Questions

Combining expert knowledge and neural networks to model environmental stresses in agriculture

Oct 26, 2021

Kostadin Cvejoski, Jannis Schuecker, Anne-Katrin Mahlein, Bogdan Georgiev

Figure 1 for Combining expert knowledge and neural networks to model environmental stresses in agriculture

Figure 2 for Combining expert knowledge and neural networks to model environmental stresses in agriculture

Figure 3 for Combining expert knowledge and neural networks to model environmental stresses in agriculture

Figure 4 for Combining expert knowledge and neural networks to model environmental stresses in agriculture

Abstract:In this work we combine representation learning capabilities of neural network with agricultural knowledge from experts to model environmental heat and drought stresses. We first design deterministic expert models which serve as a benchmark and inform the design of flexible neural-network architectures. Finally, a sensitivity analysis of the latter allows a clustering of hybrids into susceptible and resistant ones.

* 19 pages, Winners of the 2019 Syngenta Crop Challenge

Via

Access Paper or Ask Questions