Antonio Salmerón

How do Machine Learning Models Change?

Nov 14, 2024

Joel Castaño, Rafael Cabañas, Antonio Salmerón, David Lo, Silverio Martínez-Fernández

Abstract: The proliferation of Machine Learning (ML) models and their open-source implementations has transformed Artificial Intelligence research and applications. Platforms like Hugging Face (HF) enable the development, sharing, and deployment of these models, fostering an evolving ecosystem. While previous studies have examined aspects of models hosted on platforms like HF, how these models change over time remains underexplored, and a comprehensive longitudinal study is still missing. This study addresses this gap by using both repository mining and longitudinal analysis methods to examine over 200,000 commits and 1,200 releases from over 50,000 models on HF. We replicate and extend an ML change taxonomy for classifying commits and use Bayesian networks to uncover patterns in commit and release activities over time. Our findings indicate that commit activities align with established data science methodologies, such as CRISP-DM, emphasizing iterative refinement and continuous improvement. Additionally, release patterns tend to consolidate significant updates, particularly in documentation, distinguishing between granular changes and milestone-based releases. Furthermore, projects with higher popularity prioritize infrastructure enhancements early in their lifecycle, and those with intensive collaboration practices exhibit improved documentation standards. These and other insights enhance the understanding of model changes on community platforms and provide valuable guidance for best practices in model maintenance.

Counterfactual Reasoning with Probabilistic Graphical Models for Analyzing Socioecological Systems

Jan 18, 2024

Rafael Cabañas, Ana D. Maldonado, María Morales, Pedro A. Aguilera, Antonio Salmerón

Abstract: Causal and counterfactual reasoning are emerging directions in data science that allow us to reason about hypothetical scenarios. This is particularly useful in domains where experimental data are usually not available. In the context of environmental and ecological sciences, causality enables us, for example, to predict how an ecosystem would respond to hypothetical interventions. A structural causal model is a class of probabilistic graphical models for causality, which, due to its intuitive nature, can be easily understood by experts in multiple fields. However, certain queries, called unidentifiable, cannot be computed exactly. This paper proposes applying a recent technique for bounding unidentifiable queries within the domain of socioecological systems. Our findings indicate that traditional statistical analysis, including probabilistic graphical models, can identify the influence between variables. However, such methods do not offer insights into the nature of the relationship, specifically whether it involves necessity or sufficiency. This is where counterfactual reasoning becomes valuable.

* 34 pages

InferPy: Probabilistic Modeling with Deep Neural Networks Made Easy

Sep 04, 2019

Javier Cózar, Rafael Cabañas, Antonio Salmerón, Andrés R. Masegosa

Abstract: InferPy is a Python package for probabilistic modeling with deep neural networks. InferPy defines a user-friendly API that trades off model complexity against ease of use, unlike other libraries, which focus on handling very general probabilistic models at the cost of a more complex API. In particular, InferPy allows users to define, learn and evaluate general hierarchical probabilistic models containing deep neural networks in a compact and simple way. InferPy is built on top of TensorFlow, Edward2 and Keras.

* 5-page limit (paper submitted to an original software publication track). This paper briefly describes a scientific software

Probabilistic Models with Deep Neural Networks

Aug 09, 2019

Andrés R. Masegosa, Rafael Cabañas, Helge Langseth, Thomas D. Nielsen, Antonio Salmerón

Abstract: Recent advances in statistical inference have significantly expanded the toolbox of probabilistic modeling. Historically, probabilistic modeling has been constrained to (i) very restricted model classes where exact or approximate probabilistic inference was feasible, and (ii) small or medium-sized data sets that fit within the main memory of the computer. However, developments in variational inference, a general form of approximate probabilistic inference that originated in statistical physics, are allowing probabilistic modeling to overcome these restrictions: (i) approximate probabilistic inference is now possible over a broad class of probabilistic models containing a large number of parameters, and (ii) scalable inference methods based on stochastic gradient descent and distributed computation engines make it possible to apply probabilistic modeling to massive data sets. One important practical consequence of these advances is the possibility of including deep neural networks within a probabilistic model to capture complex non-linear stochastic relationships between random variables. These advances, in conjunction with the release of novel probabilistic modeling toolboxes, have greatly expanded the scope of application of probabilistic models and allow these models to take advantage of the recent strides made by the deep learning community. In this paper we review the main concepts, methods and tools needed to use deep neural networks within a probabilistic modeling framework.

AMIDST: a Java Toolbox for Scalable Probabilistic Machine Learning

Apr 04, 2017

Andrés R. Masegosa, Ana M. Martínez, Darío Ramos-López, Rafael Cabañas, Antonio Salmerón, Thomas D. Nielsen, Helge Langseth, Anders L. Madsen

Abstract: The AMIDST Toolbox is software for scalable probabilistic machine learning with a special focus on (massive) streaming data. The toolbox supports a flexible modeling language based on probabilistic graphical models with latent variables and temporal dependencies. The specified models can be learnt from large data sets using parallel or distributed implementations of Bayesian learning algorithms for either streaming or batch data. These algorithms are based on a flexible variational message passing scheme, which supports discrete and continuous variables from a wide range of probability distributions. AMIDST also leverages existing functionality and algorithms by interfacing to software tools such as Flink, Spark, MOA, Weka, R and HUGIN. AMIDST is an open source toolbox written in Java and available at http://www.amidsttoolbox.com under the Apache Software License version 2.0.
