Abstract:Understanding how climate change affects us and learning about available solutions are key steps toward empowering individuals and communities to mitigate and adapt to it. As Large Language Models (LLMs) rise in popularity, it is necessary to assess their capability in this domain. In this study, we present a comprehensive evaluation framework, grounded in science communication principles, to analyze LLM responses to climate change topics. Our framework emphasizes both the presentational and epistemological adequacy of answers, offering a fine-grained analysis of LLM generations. Spanning 8 dimensions, our framework discerns up to 30 distinct issues in model outputs. The task is a real-world example of a growing number of challenging problems where AI can complement and lift human performance. We introduce a novel and practical protocol for scalable oversight that uses AI Assistance and relies on raters with relevant educational backgrounds. We evaluate several recent LLMs and conduct a comprehensive analysis of the results, shedding light on both the potential and the limitations of LLMs in the realm of climate communication.
Abstract:Large language models are able to learn new tasks in context, where they are provided with instructions and a few annotated examples. However, the effectiveness of in-context learning is dependent on the provided context, and performance on a downstream task can vary substantially with the instruction. Importantly, this dependence on the context can manifest in unpredictable ways, e.g., a seemingly more informative instruction might lead to worse performance. In this paper, we propose an alternative approach, which we term in-context probing. Similar to in-context learning, we contextualize the representation of the input with an instruction, but instead of decoding the output prediction, we probe the contextualized representation to predict the label. Through a series of experiments on a diverse set of classification tasks, we show that in-context probing is significantly more robust to changes in instructions. We further show that probing can be particularly helpful for building classifiers on top of smaller models, and with only a hundred training examples.
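As a rough sketch of the probing recipe described above (assuming a frozen GPT-2 backbone and a toy sentiment task, neither of which is specified in the abstract), one can prepend the instruction to the input, read off the final-token hidden state, and fit a lightweight classifier on those features instead of decoding a label:
\begin{verbatim}
# Minimal sketch of in-context probing (assumptions: a frozen GPT-2 backbone
# and a toy sentiment task; the paper's models and tasks differ).
import torch
from transformers import AutoModel, AutoTokenizer
from sklearn.linear_model import LogisticRegression

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2").eval()

INSTRUCTION = "Classify the sentiment of the following review as positive or negative."

def contextualized_features(text: str) -> list[float]:
    """Encode instruction + input and return the last token's hidden state."""
    inputs = tokenizer(f"{INSTRUCTION}\n{text}", return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # (1, seq_len, dim)
    return hidden[0, -1].tolist()

# A handful of annotated examples stands in for the ~100 used in the paper.
train_texts = ["great movie", "terrible plot", "loved it", "waste of time"]
train_labels = [1, 0, 1, 0]

probe = LogisticRegression(max_iter=1000)
probe.fit([contextualized_features(t) for t in train_texts], train_labels)
print(probe.predict([contextualized_features("an absolute delight")]))
\end{verbatim}
Only the small probe is trained; the language model itself stays frozen, so a change of instruction affects the features rather than the decoding behavior.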
Abstract:Learning to search is the task of building artificial agents that learn to autonomously use a search box to find information. So far, it has been shown that current language models can learn symbolic query reformulation policies, in combination with traditional term-based retrieval, but fall short of outperforming neural retrievers. We extend the previous learning to search setup to a hybrid environment, which accepts discrete query refinement operations after a first-pass retrieval step performed by a dual encoder. Experiments on the BEIR benchmark show that search agents, trained via behavioral cloning, outperform the underlying search system based on a combined dual encoder retriever and cross encoder reranker. Furthermore, we find that simple heuristic Hybrid Retrieval Environments (HRE) can improve baseline performance by several nDCG points. The search agent based on HRE (HARE) produces state-of-the-art performance on both zero-shot and in-domain evaluations. We carry out an extensive qualitative analysis to shed light on the agents' policies.
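The hybrid environment can be pictured as a toy interface like the one below, where word-overlap scoring stands in for both the dual encoder and the cross encoder, and the exact set of refinement operations is assumed for illustration only:
\begin{verbatim}
# Toy hybrid retrieval environment: a first-pass "dual encoder" retrieval,
# discrete query refinement operations, and a "cross encoder" rerank.
# The scoring functions are word-overlap placeholders, not real models.
from dataclasses import dataclass, field

DOCS = {
    "d1": "solar power adoption in rural areas",
    "d2": "history of the steam engine",
    "d3": "wind and solar energy storage solutions",
}

def overlap_score(query: str, doc: str) -> float:
    q, d = set(query.split()), set(doc.split())
    return len(q & d) / (len(q) or 1)

@dataclass
class HybridEnv:
    query: str
    results: list[str] = field(default_factory=list)

    def first_pass(self, k: int = 3) -> list[str]:
        # Stand-in for the dual encoder retriever.
        ranked = sorted(DOCS, key=lambda d: overlap_score(self.query, DOCS[d]), reverse=True)
        self.results = ranked[:k]
        return self.results

    def refine(self, op: str, term: str) -> list[str]:
        # Discrete refinement operations over the current query.
        if op == "add":
            self.query = f"{self.query} {term}"
        elif op == "remove":
            self.query = " ".join(w for w in self.query.split() if w != term)
        return self.rerank()

    def rerank(self) -> list[str]:
        # Stand-in for the cross encoder reranker over the candidate pool.
        self.results.sort(key=lambda d: overlap_score(self.query, DOCS[d]), reverse=True)
        return self.results

env = HybridEnv("solar energy")
print(env.first_pass())
print(env.refine("add", "storage"))
\end{verbatim}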
Abstract:Can machines learn to use a search engine as an interactive tool for finding information? That would have far-reaching consequences for making the world's knowledge more accessible. This paper presents first steps in designing agents that learn meta-strategies for contextual query refinements. Our approach uses machine reading to guide the selection of refinement terms from aggregated search results. Agents are then empowered with simple but effective search operators to exert fine-grained and transparent control over queries and search results. We develop a novel way of generating synthetic search sessions, which leverages the power of transformer-based generative language models through (self-)supervised learning. We also present a reinforcement learning agent with dynamically constrained actions that can learn interactive search strategies completely from scratch. In both cases, we obtain significant improvements over one-shot search with a strong information retrieval baseline. Finally, we provide an in-depth analysis of the learned search policies.
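A minimal illustration of such a refinement loop, with a frequency heuristic standing in for the machine-reading term selector and a hypothetical "+term" operator syntax:
\begin{verbatim}
# Toy refinement loop: pick candidate terms from aggregated result snippets
# and apply a simple search operator to the query. The term scorer is a
# frequency heuristic standing in for the machine-reading component.
from collections import Counter

def select_refinement_term(question: str, snippets: list[str]) -> str:
    """Pick the most frequent snippet term not already in the question."""
    seen = set(question.lower().split())
    counts = Counter(
        w for s in snippets for w in s.lower().split() if w not in seen and len(w) > 3
    )
    term, _ = counts.most_common(1)[0]
    return term

def refine_query(question: str, snippets: list[str]) -> str:
    # Hypothetical operator syntax: '+' requires the selected term.
    return f'{question} +{select_refinement_term(question, snippets)}'

snippets = [
    "Coral bleaching is driven by rising ocean temperatures.",
    "Ocean temperatures and acidification stress coral reefs.",
]
print(refine_query("why do coral reefs die", snippets))
\end{verbatim}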
Abstract:Climate change communication in the mass media and other textual sources may affect and shape public perception. Extracting climate change information from these sources is an important task, e.g., for filtering content and e-discovery, sentiment analysis, automatic summarization, question-answering, and fact-checking. However, automating this process is a challenge, as climate change is a complex, fast-moving, and often ambiguous topic with scarce resources for popular text-based AI tasks. In this paper, we introduce \textsc{ClimaText}, a dataset for sentence-based climate change topic detection, which we make publicly available. We explore different approaches to identify the climate change topic in various text sources. We find that popular keyword-based models are not adequate for such a complex and evolving task. Context-based algorithms like BERT \cite{devlin2018bert} can detect, in addition to many trivial cases, a variety of complex and implicit topic patterns. Nevertheless, our analysis reveals great potential for improvement in several directions, such as capturing the discussion of indirect effects of climate change. Hence, we hope this work can serve as a good starting point for further research on this topic.
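To make the contrast concrete, here is a minimal sketch (with toy sentences standing in for \textsc{ClimaText} data) of a keyword baseline next to a BERT sentence classifier fine-tuned for the task; the checkpoint, keyword list, and hyperparameters are illustrative assumptions:
\begin{verbatim}
# Keyword baseline vs. fine-tuned BERT for sentence-level climate-topic
# detection. Sentences, labels, and hyperparameters are toy stand-ins.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

sentences = [
    "Sea levels are projected to rise by the end of the century.",  # climate: 1
    "The quarterly earnings beat analyst expectations.",            # climate: 0
]
labels = torch.tensor([1, 0])

def keyword_baseline(sentence: str) -> int:
    return int(any(k in sentence.lower() for k in ("climate", "warming", "emissions")))

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

batch = tokenizer(sentences, padding=True, return_tensors="pt")
for _ in range(3):  # a few gradient steps, purely illustrative
    loss = model(**batch, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

print([keyword_baseline(s) for s in sentences])  # misses the first (implicit) sentence
print(model(**batch).logits.argmax(dim=-1))
\end{verbatim}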
Abstract:We introduce CLIMATE-FEVER, a new publicly available dataset for verification of climate change-related claims. By providing a dataset for the research community, we aim to facilitate and encourage work on improving algorithms for retrieving evidential support for climate-specific claims, addressing the underlying language understanding challenges, and ultimately helping to alleviate the impact of misinformation on climate change. We adapt the methodology of FEVER [1], the largest dataset of artificially designed claims, to real-life claims collected from the Internet. While we could rely on the expertise of renowned climate scientists during this process, it turned out to be no easy task. We discuss the surprising, subtle complexity of modeling real-world climate-related claims within the \textsc{fever} framework, which we believe provides a valuable challenge for general natural language understanding. We hope that our work will mark the beginning of an exciting new long-term joint effort by the climate science and AI communities.
Abstract:We investigate a framework for machine reading, inspired by real world information-seeking problems, where a meta question answering system interacts with a black box environment. The environment encapsulates a competitive machine reader based on BERT, providing candidate answers to questions, and possibly some context. To validate the realism of our formulation, we ask humans to play the role of a meta-answerer. With just a small snippet of text around an answer, humans can outperform the machine reader, improving recall. Similarly, a simple machine meta-answerer outperforms the environment, improving both precision and recall on the Natural Questions dataset. The system relies on joint training of answer scoring and the selection of conditioning information.
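A toy rendering of the meta-answerer interface described above, with a hand-set interpolation weight standing in for the jointly trained scoring model:
\begin{verbatim}
# Toy meta-answerer over a black-box reader: each candidate comes with the
# reader's score and a short snippet around the answer; the meta-answerer
# rescores candidates by combining the reader score with question-snippet
# overlap. The fixed weight alpha stands in for the learned scoring model.
def snippet_overlap(question: str, snippet: str) -> float:
    q, s = set(question.lower().split()), set(snippet.lower().split())
    return len(q & s) / (len(q) or 1)

def meta_answer(question: str, candidates: list[dict], alpha: float = 0.5) -> str:
    """candidates: [{'answer': str, 'reader_score': float, 'snippet': str}, ...]"""
    def combined(c):
        return alpha * c["reader_score"] + (1 - alpha) * snippet_overlap(question, c["snippet"])
    return max(candidates, key=combined)["answer"]

candidates = [
    {"answer": "1912", "reader_score": 0.9, "snippet": "the bridge was rebuilt in 1912"},
    {"answer": "1889", "reader_score": 0.7, "snippet": "the tower was completed in 1889 for the fair"},
]
print(meta_answer("when was the tower completed", candidates))  # -> 1889
\end{verbatim}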
Abstract:We propose a method to efficiently learn diverse strategies in reinforcement learning for query reformulation in the tasks of document retrieval and question answering. In the proposed framework an agent consists of multiple specialized sub-agents and a meta-agent that learns to aggregate the answers from sub-agents to produce a final answer. Sub-agents are trained on disjoint partitions of the training data, while the meta-agent is trained on the full training set. Our method makes learning faster, because it is highly parallelizable, and has better generalization performance than strong baselines, such as an ensemble of agents trained on the full data. We show that the improved performance is due to the increased diversity of reformulation strategies.
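The decomposition can be sketched as follows; the sub-agents here are simple lookups trained on disjoint shards, standing in for learned reformulation policies, and the meta-agent aggregates their answers by confidence-weighted voting:
\begin{verbatim}
# Toy version of the sub-agent / meta-agent decomposition: sub-agents are
# "trained" (here: memorized) on disjoint partitions of the data; the
# meta-agent aggregates their answers by weighted voting.
from collections import defaultdict

def partition(data: list, n: int) -> list[list]:
    """Split the training data into n disjoint shards."""
    return [data[i::n] for i in range(n)]

def make_sub_agent(shard: list[tuple[str, str]]):
    memory = dict(shard)  # stand-in for a policy trained on this shard only
    def agent(question: str) -> tuple[str, float]:
        answer = memory.get(question, "unknown")
        return answer, 1.0 if answer != "unknown" else 0.1
    return agent

def meta_agent(question: str, sub_agents) -> str:
    votes = defaultdict(float)
    for agent in sub_agents:
        answer, confidence = agent(question)
        votes[answer] += confidence
    return max(votes, key=votes.get)

data = [("capital of france", "paris"), ("capital of peru", "lima"),
        ("capital of japan", "tokyo"), ("capital of kenya", "nairobi")]
sub_agents = [make_sub_agent(shard) for shard in partition(data, 2)]
print(meta_agent("capital of peru", sub_agents))  # -> lima
\end{verbatim}
Training each sub-agent on its own shard is what makes the scheme highly parallelizable, while the meta-agent sees the full data.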
Abstract:Neural Machine Translation (NMT) systems rely on large amounts of parallel data. This is a major challenge for low-resource languages. Building on recent work on unsupervised and semi-supervised methods, we present an approach that combines zero-shot and dual learning. The latter relies on reinforcement learning to exploit the duality of the machine translation task and requires only monolingual data for the target language pair. Experiments show that a zero-shot dual system, trained on English-French and English-Spanish, outperforms a standard NMT system by large margins in zero-shot translation on Spanish-French (both directions). The zero-shot dual method comes within 2.2 BLEU points of a comparable supervised setting. Our method also obtains improvements in the setting where a small amount of parallel data for the zero-shot language pair is available. When we add Russian, extending our experiments to jointly model 6 zero-shot translation directions, all directions improve by between 4 and 15 BLEU points, again reaching performance near that of the supervised setting.
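For reference, the dual-learning reward underlying this setup is usually written as follows (the interpolation weight and exact form used in the paper are not specified above, so this is the standard formulation rather than the paper's): for a monolingual source sentence $x$ and a sampled forward translation $\hat{y}$,
\[
r(x, \hat{y}) = \alpha \, \log P_{\mathrm{LM}}(\hat{y}) + (1 - \alpha) \, \log P(x \mid \hat{y}; \theta_{\mathrm{back}}),
\]
where the first term rewards target-side fluency under a language model and the second rewards reconstructing $x$ through the backward translation model, so that no parallel data for the zero-shot pair is needed.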
Abstract:We frame Question Answering (QA) as a Reinforcement Learning task, an approach that we call Active Question Answering. We propose an agent that sits between the user and a black box QA system and learns to reformulate questions to elicit the best possible answers. The agent probes the system with potentially many natural language reformulations of an initial question and aggregates the returned evidence to yield the best answer. The reformulation system is trained end-to-end to maximize answer quality using policy gradient. We evaluate on SearchQA, a dataset of complex questions extracted from Jeopardy!. The agent outperforms a state-of-the-art base model, which plays the role of the environment, as well as other benchmarks. We also analyze the language that the agent has learned while interacting with the question answering system. We find that successful question reformulations look quite different from natural language paraphrases. The agent is able to discover non-trivial reformulation strategies that resemble classic information retrieval techniques such as term re-weighting (tf-idf) and stemming.
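Schematically, the probe-and-aggregate loop looks like the sketch below; the reformulator is reduced to a fixed list of templates and the QA environment to a canned lookup, whereas in the paper the reformulator is a sequence-to-sequence model trained with policy gradient:
\begin{verbatim}
# Schematic of the Active QA loop: probe a black-box QA system with several
# reformulations of the question and aggregate the returned answers.
from collections import defaultdict

def black_box_qa(question: str) -> tuple[str, float]:
    """Stand-in for the QA environment: returns (answer, confidence)."""
    canned = {
        "who wrote hamlet": ("shakespeare", 0.9),
        "hamlet author": ("shakespeare", 0.8),
        "author of the play hamlet": ("francis bacon", 0.3),
    }
    return canned.get(question, ("unknown", 0.0))

def reformulate(question: str) -> list[str]:
    # Fixed reformulations for illustration; the learned policy samples these.
    return [question, "hamlet author", "author of the play hamlet"]

def active_qa(question: str) -> str:
    votes = defaultdict(float)
    for q in reformulate(question):
        answer, confidence = black_box_qa(q)
        votes[answer] += confidence
    return max(votes, key=votes.get)

print(active_qa("who wrote hamlet"))  # -> shakespeare
\end{verbatim}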