Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Jože M. Rožanec

Causal Cartographer: From Mapping to Reasoning Over Counterfactual Worlds

May 20, 2025

Gaël Gendron, Jože M. Rožanec, Michael Witbrock, Gillian Dobbie

Abstract:Causal world models are systems that can answer counterfactual questions about an environment of interest, i.e. predict how it would have evolved if an arbitrary subset of events had been realized differently. It requires understanding the underlying causes behind chains of events and conducting causal inference for arbitrary unseen distributions. So far, this task eludes foundation models, notably large language models (LLMs), which do not have demonstrated causal reasoning capabilities beyond the memorization of existing causal relationships. Furthermore, evaluating counterfactuals in real-world applications is challenging since only the factual world is observed, limiting evaluation to synthetic datasets. We address these problems by explicitly extracting and modeling causal relationships and propose the Causal Cartographer framework. First, we introduce a graph retrieval-augmented generation agent tasked to retrieve causal relationships from data. This approach allows us to construct a large network of real-world causal relationships that can serve as a repository of causal knowledge and build real-world counterfactuals. In addition, we create a counterfactual reasoning agent constrained by causal relationships to perform reliable step-by-step causal inference. We show that our approach can extract causal knowledge and improve the robustness of LLMs for causal reasoning tasks while reducing inference costs and spurious correlations.

* 29 pages, 9 pages for the main paper, 20 pages for the references and appendix, 25 figures

Via

Access Paper or Ask Questions

Counterfactual Causal Inference in Natural Language with Large Language Models

Oct 08, 2024

Gaël Gendron, Jože M. Rožanec, Michael Witbrock, Gillian Dobbie

Figure 1 for Counterfactual Causal Inference in Natural Language with Large Language Models

Figure 2 for Counterfactual Causal Inference in Natural Language with Large Language Models

Figure 3 for Counterfactual Causal Inference in Natural Language with Large Language Models

Figure 4 for Counterfactual Causal Inference in Natural Language with Large Language Models

Abstract:Causal structure discovery methods are commonly applied to structured data where the causal variables are known and where statistical testing can be used to assess the causal relationships. By contrast, recovering a causal structure from unstructured natural language data such as news articles contains numerous challenges due to the absence of known variables or counterfactual data to estimate the causal links. Large Language Models (LLMs) have shown promising results in this direction but also exhibit limitations. This work investigates LLM's abilities to build causal graphs from text documents and perform counterfactual causal inference. We propose an end-to-end causal structure discovery and causal inference method from natural language: we first use an LLM to extract the instantiated causal variables from text data and build a causal graph. We merge causal graphs from multiple data sources to represent the most exhaustive set of causes possible. We then conduct counterfactual inference on the estimated graph. The causal graph conditioning allows reduction of LLM biases and better represents the causal estimands. We use our method to show that the limitations of LLMs in counterfactual causal reasoning come from prediction errors and propose directions to mitigate them. We demonstrate the applicability of our method on real-world news articles.

* 22 pages, 10 pages for the main paper, 12 pages for the references and appendix, 5 figures

Via

Access Paper or Ask Questions

Dealing with zero-inflated data: achieving SOTA with a two-fold machine learning approach

Oct 12, 2023

Jože M. Rožanec, Gašper Petelin, João Costa, Blaž Bertalanič, Gregor Cerar, Marko Guček, Gregor Papa, Dunja Mladenić

Abstract:In many cases, a machine learning model must learn to correctly predict a few data points with particular values of interest in a broader range of data where many target values are zero. Zero-inflated data can be found in diverse scenarios, such as lumpy and intermittent demands, power consumption for home appliances being turned on and off, impurities measurement in distillation processes, and even airport shuttle demand prediction. The presence of zeroes affects the models' learning and may result in poor performance. Furthermore, zeroes also distort the metrics used to compute the model's prediction quality. This paper showcases two real-world use cases (home appliances classification and airport shuttle demand prediction) where a hierarchical model applied in the context of zero-inflated data leads to excellent results. In particular, for home appliances classification, the weighted average of Precision, Recall, F1, and AUC ROC was increased by 27%, 34%, 49%, and 27%, respectively. Furthermore, it is estimated that the proposed approach is also four times more energy efficient than the SOTA approach against which it was compared to. Two-fold models performed best in all cases when predicting airport shuttle demand, and the difference against other models has been proven to be statistically significant.

Via

Access Paper or Ask Questions

Worker Activity Recognition in Manufacturing Line Using Near-body Electric Field

Aug 07, 2023

Sungho Suh, Vitor Fortes Rey, Sizhen Bian, Yu-Chi Huang, Jože M. Rožanec, Hooman Tavakoli Ghinani, Bo Zhou, Paul Lukowicz

Figure 1 for Worker Activity Recognition in Manufacturing Line Using Near-body Electric Field

Figure 2 for Worker Activity Recognition in Manufacturing Line Using Near-body Electric Field

Figure 3 for Worker Activity Recognition in Manufacturing Line Using Near-body Electric Field

Figure 4 for Worker Activity Recognition in Manufacturing Line Using Near-body Electric Field

Abstract:Manufacturing industries strive to improve production efficiency and product quality by deploying advanced sensing and control systems. Wearable sensors are emerging as a promising solution for achieving this goal, as they can provide continuous and unobtrusive monitoring of workers' activities in the manufacturing line. This paper presents a novel wearable sensing prototype that combines IMU and body capacitance sensing modules to recognize worker activities in the manufacturing line. To handle these multimodal sensor data, we propose and compare early, and late sensor data fusion approaches for multi-channel time-series convolutional neural networks and deep convolutional LSTM. We evaluate the proposed hardware and neural network model by collecting and annotating sensor data using the proposed sensing prototype and Apple Watches in the testbed of the manufacturing line. Experimental results demonstrate that our proposed methods achieve superior performance compared to the baseline methods, indicating the potential of the proposed approach for real-world applications in manufacturing industries. Furthermore, the proposed sensing prototype with a body capacitive sensor and feature fusion method improves by 6.35%, yielding a 9.38% higher macro F1 score than the proposed sensing prototype without a body capacitive sensor and Apple Watch data, respectively.

Via

Access Paper or Ask Questions

Human in the AI loop via xAI and Active Learning for Visual Inspection

Jul 17, 2023

Jože M. Rožanec, Elias Montini, Vincenzo Cutrona, Dimitrios Papamartzivanos, Timotej Klemenčič, Blaž Fortuna, Dunja Mladenić, Entso Veliou, Thanassis Giannetsos, Christos Emmanouilidis

Abstract:Industrial revolutions have historically disrupted manufacturing by introducing automation into production. Increasing automation reshapes the role of the human worker. Advances in robotics and artificial intelligence open new frontiers of human-machine collaboration. Such collaboration can be realized considering two sub-fields of artificial intelligence: active learning and explainable artificial intelligence. Active learning aims to devise strategies that help obtain data that allows machine learning algorithms to learn better. On the other hand, explainable artificial intelligence aims to make the machine learning models intelligible to the human person. The present work first describes Industry 5.0, human-machine collaboration, and state-of-the-art regarding quality inspection, emphasizing visual inspection. Then it outlines how human-machine collaboration could be realized and enhanced in visual inspection. Finally, some of the results obtained in the EU H2020 STAR project regarding visual inspection are shared, considering artificial intelligence, human digital twins, and cybersecurity.

Via

Access Paper or Ask Questions

Synthetic Data Augmentation Using GAN For Improved Automated Visual Inspection

Dec 19, 2022

Jože M. Rožanec, Patrik Zajec, Spyros Theodoropoulos, Erik Koehorst, Blaž Fortuna, Dunja Mladenić

Figure 1 for Synthetic Data Augmentation Using GAN For Improved Automated Visual Inspection

Abstract:Quality control is a crucial activity performed by manufacturing companies to ensure their products conform to the requirements and specifications. The introduction of artificial intelligence models enables to automate the visual quality inspection, speeding up the inspection process and ensuring all products are evaluated under the same criteria. In this research, we compare supervised and unsupervised defect detection techniques and explore data augmentation techniques to mitigate the data imbalance in the context of automated visual inspection. Furthermore, we use Generative Adversarial Networks for data augmentation to enhance the classifiers' discriminative performance. Our results show that state-of-the-art unsupervised defect detection does not match the performance of supervised models but can be used to reduce the labeling workload by more than 50%. Furthermore, the best classification performance was achieved considering GAN-based data generation with AUC ROC scores equal to or higher than 0,9898, even when increasing the dataset imbalance by leaving only 25\% of the images denoting defective products. We performed the research with real-world data provided by Philips Consumer Lifestyle BV.

Via

Access Paper or Ask Questions

Robust Anomaly Map Assisted Multiple Defect Detection with Supervised Classification Techniques

Dec 19, 2022

Jože M. Rožanec, Patrik Zajec, Spyros Theodoropoulos, Erik Koehorst, Blaž Fortuna, Dunja Mladenić

Figure 1 for Robust Anomaly Map Assisted Multiple Defect Detection with Supervised Classification Techniques

Figure 2 for Robust Anomaly Map Assisted Multiple Defect Detection with Supervised Classification Techniques

Figure 3 for Robust Anomaly Map Assisted Multiple Defect Detection with Supervised Classification Techniques

Figure 4 for Robust Anomaly Map Assisted Multiple Defect Detection with Supervised Classification Techniques

Abstract:Industry 4.0 aims to optimize the manufacturing environment by leveraging new technological advances, such as new sensing capabilities and artificial intelligence. The DRAEM technique has shown state-of-the-art performance for unsupervised classification. The ability to create anomaly maps highlighting areas where defects probably lie can be leveraged to provide cues to supervised classification models and enhance their performance. Our research shows that the best performance is achieved when training a defect detection model by providing an image and the corresponding anomaly map as input. Furthermore, such a setting provides consistent performance when framing the defect detection as a binary or multiclass classification problem and is not affected by class balancing policies. We performed the experiments on three datasets with real-world data provided by Philips Consumer Lifestyle BV.

Via

Access Paper or Ask Questions

Machine Beats Machine: Machine Learning Models to Defend Against Adversarial Attacks

Sep 28, 2022

Jože M. Rožanec, Dimitrios Papamartzivanos, Entso Veliou, Theodora Anastasiou, Jelle Keizer, Blaž Fortuna, Dunja Mladenić

Figure 1 for Machine Beats Machine: Machine Learning Models to Defend Against Adversarial Attacks

Figure 2 for Machine Beats Machine: Machine Learning Models to Defend Against Adversarial Attacks

Figure 3 for Machine Beats Machine: Machine Learning Models to Defend Against Adversarial Attacks

Figure 4 for Machine Beats Machine: Machine Learning Models to Defend Against Adversarial Attacks

Abstract:We propose using a two-layered deployment of machine learning models to prevent adversarial attacks. The first layer determines whether the data was tampered, while the second layer solves a domain-specific problem. We explore three sets of features and three dataset variations to train machine learning models. Our results show clustering algorithms achieved promising results. In particular, we consider the best results were obtained by applying the DBSCAN algorithm to the structured structural similarity index measure computed between the images and a white reference image.

Via

Access Paper or Ask Questions

Forecasting Sensor Values in Waste-To-Fuel Plants: a Case Study

Sep 28, 2022

Bor Brecelj, Beno Šircelj, Jože M. Rožanec, Blaž Fortuna, Dunja Mladenić

Figure 1 for Forecasting Sensor Values in Waste-To-Fuel Plants: a Case Study

Figure 2 for Forecasting Sensor Values in Waste-To-Fuel Plants: a Case Study

Figure 3 for Forecasting Sensor Values in Waste-To-Fuel Plants: a Case Study

Figure 4 for Forecasting Sensor Values in Waste-To-Fuel Plants: a Case Study

Abstract:In this research, we develop machine learning models to predict future sensor readings of a waste-to-fuel plant, which would enable proactive control of the plant's operations. We developed models that predict sensor readings for 30 and 60 minutes into the future. The models were trained using historical data, and predictions were made based on sensor readings taken at a specific time. We compare three types of models: (a) a n\"aive prediction that considers only the last predicted value, (b) neural networks that make predictions based on past sensor data (we consider different time window sizes for making a prediction), and (c) a gradient boosted tree regressor created with a set of features that we developed. We developed and tested our models on a real-world use case at a waste-to-fuel plant in Canada. We found that approach (c) provided the best results, while approach (b) provided mixed results and was not able to outperform the n\"aive consistently.

Via

Access Paper or Ask Questions

Active Learning and Approximate Model Calibration for Automated Visual Inspection in Manufacturing

Sep 12, 2022

Jože M. Rožanec, Luka Bizjak, Elena Trajkova, Patrik Zajec, Jelle Keizer, Blaž Fortuna, Dunja Mladenić

Figure 1 for Active Learning and Approximate Model Calibration for Automated Visual Inspection in Manufacturing

Figure 2 for Active Learning and Approximate Model Calibration for Automated Visual Inspection in Manufacturing

Figure 3 for Active Learning and Approximate Model Calibration for Automated Visual Inspection in Manufacturing

Figure 4 for Active Learning and Approximate Model Calibration for Automated Visual Inspection in Manufacturing

Abstract:Quality control is a crucial activity performed by manufacturing enterprises to ensure that their products meet quality standards and avoid potential damage to the brand's reputation. The decreased cost of sensors and connectivity enabled increasing digitalization of manufacturing. In addition, artificial intelligence enables higher degrees of automation, reducing overall costs and time required for defect inspection. This research compares three active learning approaches (with single and multiple oracles) to visual inspection. We propose a novel approach to probabilities calibration of classification models and two new metrics to assess the performance of the calibration without the need for ground truth. We performed experiments on real-world data provided by Philips Consumer Lifestyle BV. Our results show that explored active learning settings can reduce the data labeling effort by between three and four percent without detriment to the overall quality goals, considering a threshold of p=0.95. Furthermore, we show that the proposed metrics successfully capture relevant information otherwise available to metrics used up to date only through ground truth data. Therefore, the proposed metrics can be used to estimate the quality of models' probability calibration without committing to a labeling effort to obtain ground truth data.

Via

Access Paper or Ask Questions