Abstract:Deep learning foundation models are revolutionizing many facets of science by leveraging vast amounts of data to learn general-purpose representations that can be adapted to tackle diverse downstream tasks. Foundation models hold the promise to also transform our ability to model our planet and its subsystems by exploiting the vast expanse of Earth system data. Here we introduce Aurora, a large-scale foundation model of the atmosphere trained on over a million hours of diverse weather and climate data. Aurora leverages the strengths of the foundation modelling approach to produce operational forecasts for a wide variety of atmospheric prediction problems, including those with limited training data, heterogeneous variables, and extreme events. In under a minute, Aurora produces 5-day global air pollution predictions and 10-day high-resolution weather forecasts that outperform state-of-the-art classical simulation tools and the best specialized deep learning models. Taken together, these results indicate that foundation models can transform environmental forecasting.
Abstract:We present Clifford-Steerable Convolutional Neural Networks (CS-CNNs), a novel class of $\mathrm{E}(p, q)$-equivariant CNNs. CS-CNNs process multivector fields on pseudo-Euclidean spaces $\mathbb{R}^{p,q}$. They cover, for instance, $\mathrm{E}(3)$-equivariance on $\mathbb{R}^3$ and Poincar\'e-equivariance on Minkowski spacetime $\mathbb{R}^{1,3}$. Our approach is based on an implicit parametrization of $\mathrm{O}(p,q)$-steerable kernels via Clifford group equivariant neural networks. We significantly and consistently outperform baseline methods on fluid dynamics as well as relativistic electrodynamics forecasting tasks.
Abstract:Existing approaches for semi-supervised object detection assume a fixed set of classes present in training and unlabeled datasets, i.e., in-distribution (ID) data. The performance of these techniques significantly degrades when they are deployed in the open world, because the unlabeled and test data may contain objects that were not seen during training, i.e., out-of-distribution (OOD) data. The two key questions we explore in this paper are: can we detect these OOD samples and, if so, can we learn from them? With these considerations in mind, we propose the Open World Semi-supervised Detection framework (OWSSD), which effectively detects OOD data along with a semi-supervised learning pipeline that learns from both ID and OOD data. We introduce an ensemble-based OOD detector consisting of lightweight auto-encoder networks trained only on ID data. Through extensive evaluation, we demonstrate that our method performs competitively against state-of-the-art OOD detection algorithms and also significantly boosts semi-supervised learning performance in open-world scenarios.
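A minimal sketch of the general idea behind the OOD detector described above: an ensemble of lightweight auto-encoders is trained only on ID feature vectors, and samples with a high average reconstruction error are flagged as OOD. The architecture, feature dimensionality, and threshold below are illustrative assumptions, not the exact OWSSD implementation.

```python
# Illustrative sketch: ensemble of lightweight auto-encoders for OOD scoring.
import torch
import torch.nn as nn

class TinyAutoEncoder(nn.Module):
    def __init__(self, dim: int, hidden: int = 32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU())
        self.decoder = nn.Linear(hidden, dim)

    def forward(self, x):
        return self.decoder(self.encoder(x))

def ood_score(ensemble, features):
    """Average reconstruction error over the ensemble; higher means more likely OOD."""
    with torch.no_grad():
        errors = [((ae(features) - features) ** 2).mean(dim=1) for ae in ensemble]
    return torch.stack(errors).mean(dim=0)

# Usage sketch: train each auto-encoder on ID feature vectors, then flag samples
# whose score exceeds a validation-calibrated threshold as OOD.
ensemble = [TinyAutoEncoder(dim=256) for _ in range(5)]
scores = ood_score(ensemble, torch.randn(8, 256))
is_ood = scores > scores.mean() + 2 * scores.std()  # placeholder threshold
```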
Abstract:Model explainability has become an important problem in machine learning (ML) due to the increased effect that algorithmic predictions have on humans. Explanations can help users understand not only why ML models make certain predictions, but also how these predictions can be changed. In this thesis, we examine the explainability of ML models from three vantage points: algorithms, users, and pedagogy, and contribute several novel solutions to the explainability problem.
Abstract:When using medical images for diagnosis, whether by clinicians or artificial intelligence (AI) systems, it is important that the images are of high quality. When an image is of low quality, the medical exam that produced it often needs to be redone. In telemedicine, a common problem is that the quality issue is only flagged once the patient has left the clinic, meaning they must return to have the exam redone. This can be especially difficult for people living in remote regions, who make up a substantial portion of the patients at Portal Telemedicina, a digital healthcare organization based in Brazil. In this paper, we report on ongoing work regarding (i) the development of an AI system for flagging and explaining low-quality medical images in real-time, (ii) an interview study to understand the explanation needs of stakeholders using the AI system at Portal Telemedicina, and (iii) a longitudinal user study design to examine the effect of including explanations on the workflow of the technicians in our clinics. To the best of our knowledge, this would be the first longitudinal study evaluating the effects of XAI methods on end-users -- stakeholders who use AI systems but do not have AI-specific expertise. We welcome feedback and suggestions on our experimental setup.
Abstract:There has been significant debate in the NLP community about whether attention weights can be used as an explanation - a mechanism for interpreting how important each input token is for a particular prediction. The validity of "attention as explanation" has so far been evaluated by computing the rank correlation between attention-based explanations and existing feature attribution explanations using LSTM-based models. In our work, we (i) compare the rank correlation between five more recent feature attribution methods and two attention-based methods on two types of NLP tasks, and (ii) extend this analysis to transformer-based models. We find that attention-based explanations do not correlate strongly with any of the recent feature attribution methods, regardless of the model or task. Furthermore, we find that none of the tested explanations correlate strongly with one another for the transformer-based model. This leads us to question the underlying assumption that the validity of attention-based explanations should be measured by how well they correlate with existing feature attribution methods. After conducting experiments on five datasets with two different models, we argue that the community should stop using rank correlation as an evaluation metric for attention-based explanations. We suggest that researchers and practitioners instead test various explanation methods and employ a human-in-the-loop process to determine whether the explanations align with human intuition for the particular use case at hand.
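To make the evaluation setup being critiqued concrete, here is a minimal sketch of computing rank correlation between an attention-based explanation and a feature attribution score for the same input tokens. The scores are made-up numbers for illustration; in the paper's experiments they would come from trained LSTM- or transformer-based models.

```python
# Rank correlation between two token-level importance scores for one input.
import numpy as np
from scipy.stats import kendalltau, spearmanr

attention = np.array([0.05, 0.40, 0.10, 0.30, 0.15])    # attention over 5 tokens
attribution = np.array([0.20, 0.10, 0.35, 0.25, 0.10])  # e.g. a gradient-based method

tau, _ = kendalltau(attention, attribution)
rho, _ = spearmanr(attention, attribution)
print(f"Kendall tau = {tau:.2f}, Spearman rho = {rho:.2f}")
# A low correlation here does not, by itself, tell us which explanation (if either)
# is faithful -- which is the abstract's argument against relying on this metric.
```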
Abstract:In this work we describe the setup of a technical, graduate-level course on Fairness, Accountability, Confidentiality and Transparency in Artificial Intelligence (FACT-AI) at the University of Amsterdam, which teaches FACT-AI concepts through the lens of reproducibility. The focal point of the course is a group project in which students reproduce existing FACT-AI algorithms from top AI conferences and write a report about their experiences. In the first iteration of the course, we created an open-source repository with the code implementations from the group projects. In the second iteration, we encouraged students to submit their group projects to the Machine Learning Reproducibility Challenge, which resulted in 9 reports from our course being accepted to the challenge. We reflect on our experience teaching the course over two academic years, one of which coincided with a global pandemic, and propose guidelines for teaching FACT-AI through reproducibility in graduate-level AI programs. We hope this can be a useful resource for instructors who wish to set up similar courses at their universities in the future.
Abstract:In Natural Language Processing, feature-additive explanation methods quantify the independent contribution of each input token towards a model's decision. By computing the rank correlation between attention weights and the scores produced by a small sample of these methods, previous analyses have sought to either invalidate or support the role of attention-based explanations as a faithful and plausible measure of salience. To investigate what rank correlation can reliably conclude, we comprehensively compare feature-additive methods, including attention-based explanations, across several neural architectures and tasks. In most cases, we find that none of our chosen methods agree. Therefore, we argue that rank correlation is largely uninformative and does not measure the quality of feature-additive methods. Additionally, the range of conclusions a practitioner may draw from a single explainability algorithm is limited.
Abstract:In hybrid human-AI systems, users need to decide whether or not to trust an algorithmic prediction when the true error of the prediction is unknown. To accommodate such settings, we introduce RETRO-VIZ, a method for (i) estimating and (ii) explaining the trustworthiness of regression predictions. It consists of RETRO, a quantitative estimate of the trustworthiness of a prediction, and VIZ, a visual explanation that helps users identify the reasons for the (lack of) trustworthiness of a prediction. We find that RETRO-scores correlate negatively with prediction error across 117 experimental settings, indicating that RETRO provides a useful measure for distinguishing trustworthy predictions from untrustworthy ones. In a user study with 41 participants, we find that VIZ-explanations help users identify whether a prediction is trustworthy: on average, 95.1% of participants correctly select the more trustworthy prediction from a given pair of predictions. In addition, an average of 75.6% of participants can accurately describe why a prediction seems to be (not) trustworthy. Finally, we find that the vast majority of users subjectively experience RETRO-VIZ as a useful tool for assessing the trustworthiness of algorithmic predictions.
Abstract:Graph neural networks (GNNs) have shown increasing promise in real-world applications, which has led to increased interest in understanding their predictions. However, existing methods for explaining predictions from GNNs do not provide an opportunity for recourse: given a prediction for a particular instance, we want to understand how the prediction can be changed. We propose CF-GNNExplainer: the first method for generating counterfactual explanations for GNNs, i.e., the minimal perturbations to the input graph data such that the prediction changes. Using only edge deletions, we find that we are able to generate counterfactual examples for the majority of instances across three widely used datasets for GNN explanations, while removing fewer than 3 edges on average, with at least 94% accuracy. This indicates that CF-GNNExplainer primarily removes edges that are crucial for the original predictions, resulting in minimal counterfactual examples.
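As a rough illustration of counterfactual explanation via edge deletion, the sketch below greedily removes edges until a GNN node classifier's prediction flips. The greedy search and the `model(x, edge_index)` interface are simplifying assumptions for illustration only; CF-GNNExplainer itself learns a differentiable perturbation mask over the adjacency matrix rather than searching greedily.

```python
# Greedy edge-deletion counterfactual for a node classifier (illustrative only).
import torch
import torch.nn.functional as F

def greedy_edge_deletion_cf(model, x, edge_index, node_idx, max_deletions=3):
    """Delete edges one at a time until the prediction for `node_idx` changes."""
    with torch.no_grad():
        orig_class = model(x, edge_index).argmax(dim=1)[node_idx].item()
        edges = edge_index.clone()
        for _ in range(max_deletions):
            best_keep, best_prob = None, float("inf")
            for e in range(edges.size(1)):
                keep = torch.ones(edges.size(1), dtype=torch.bool)
                keep[e] = False
                logits = model(x, edges[:, keep])
                if logits.argmax(dim=1)[node_idx].item() != orig_class:
                    return edges[:, keep]                 # counterfactual found
                prob = F.softmax(logits, dim=1)[node_idx, orig_class].item()
                if prob < best_prob:                      # remember the deletion that
                    best_keep, best_prob = keep, prob     # hurts the original class most
            if best_keep is None:                         # no edges left to delete
                break
            edges = edges[:, best_keep]
    return None                                           # no counterfactual within budget
```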