Claudio Fanconi

Towards a Cascaded LLM Framework for Cost-effective Human-AI Decision-Making

Jun 16, 2025

Claudio Fanconi, Mihaela van der Schaar

Abstract: Effective human-AI decision-making balances three key factors: the correctness of predictions, the cost of knowledge and reasoning complexity, and the confidence about whether to abstain from automated answers or involve human experts. In this work, we present a cascaded LLM decision framework that adaptively delegates tasks across multiple tiers of expertise: a base model for initial candidate answers, a more capable and knowledgeable (but costlier) large model, and a human expert for when the model cascade abstains. Our method proceeds in two stages. First, a deferral policy determines, based on the confidence score, whether to accept the base model's answer or regenerate it with the large model. Second, an abstention policy decides whether the cascade's response is sufficiently certain or requires human intervention. Moreover, we incorporate an online learning mechanism into the framework that can leverage human feedback to improve decision quality over time. We apply this approach to general question-answering (ARC-Easy and ARC-Challenge) and medical question-answering (MedQA and MedMCQA). Our results show that, in most cases, our cascaded strategy outperforms single-model baselines in accuracy while reducing cost and providing a principled way to handle abstentions.
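
The two-stage routing described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the fixed thresholds, the cost values, the `Model` type, and the names `cascade` and `CascadeResult` are all hypothetical, and the online learning component is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Hypothetical interface: a model maps a question to (answer, confidence in [0, 1]).
Model = Callable[[str], Tuple[str, float]]


@dataclass
class CascadeResult:
    answer: Optional[str]  # None means the cascade abstained
    source: str            # "base", "large", or "human"
    cost: float            # accumulated model cost (arbitrary units)


def cascade(
    question: str,
    base: Model,
    large: Model,
    defer_threshold: float = 0.7,    # assumed value, for illustration only
    abstain_threshold: float = 0.5,  # assumed value, for illustration only
    base_cost: float = 1.0,
    large_cost: float = 10.0,
) -> CascadeResult:
    # Stage 1 (deferral): keep the base answer only if it is confident enough,
    # otherwise regenerate the answer with the large model.
    answer, confidence = base(question)
    cost, source = base_cost, "base"
    if confidence < defer_threshold:
        answer, confidence = large(question)
        cost += large_cost
        source = "large"

    # Stage 2 (abstention): hand over to a human expert if the cascade's
    # final answer is still too uncertain.
    if confidence < abstain_threshold:
        return CascadeResult(answer=None, source="human", cost=cost)
    return CascadeResult(answer=answer, source=source, cost=cost)
```

With stub models, a confident base answer is returned at base cost, a deferred question pays for both models, and a low-confidence large-model answer is routed to the human expert.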

Few-shot Steerable Alignment: Adapting Rewards and LLM Policies with Neural Processes

Dec 18, 2024

Katarzyna Kobalczyk, Claudio Fanconi, Hao Sun, Mihaela van der Schaar

Abstract: As large language models (LLMs) become increasingly embedded in everyday applications, ensuring their alignment with the diverse preferences of individual users has become a critical challenge. Currently deployed approaches typically assume homogeneous user objectives and rely on single-objective fine-tuning. However, human preferences are inherently heterogeneous, influenced by various unobservable factors, leading to conflicting signals in preference data. Existing solutions addressing this diversity often require costly datasets labelled for specific objectives and involve training multiple reward models or LLM policies, which is computationally expensive and impractical. In this work, we present a novel framework for few-shot steerable alignment, where users' underlying preferences are inferred from a small sample of their choices. To achieve this, we extend the Bradley-Terry-Luce model to handle heterogeneous preferences with unobserved variability factors and propose its practical implementation for reward modelling and LLM fine-tuning. Thanks to our proposed approach of functional parameter-space conditioning, LLMs trained with our framework can be adapted to individual preferences at inference time, generating outputs over a continuum of behavioural modes. We empirically validate the effectiveness of our methods, demonstrating their ability to capture and align with diverse human preferences in a data-efficient manner. Our code is made available at: https://github.com/kasia-kobalczyk/few-shot-steerable-alignment.
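
As a rough illustration of the idea, the standard Bradley-Terry-Luce model gives the probability that response a is preferred to response b as the sigmoid of the reward difference. The sketch below conditions the reward on a per-user latent vector `z`; the linear reward, the feature vectors, and the function names are assumptions chosen for brevity and are not the neural-process parameterisation used in the paper.

```python
import math
from typing import Sequence


def btl_preference_prob(reward_a: float, reward_b: float) -> float:
    """Bradley-Terry-Luce probability that option a is preferred to option b."""
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))


def user_reward(features: Sequence[float], z: Sequence[float]) -> float:
    """Toy user-conditioned reward (assumption): dot product of response
    features with a latent preference vector z for that user."""
    return sum(f * w for f, w in zip(features, z))


# Two hypothetical responses described by two features each.
features_a = [1.0, 0.0]
features_b = [0.0, 1.0]

# Two hypothetical users who weight the features in opposite ways.
z_user_1 = [2.0, 0.0]
z_user_2 = [0.0, 2.0]

p_user_1 = btl_preference_prob(
    user_reward(features_a, z_user_1), user_reward(features_b, z_user_1)
)
p_user_2 = btl_preference_prob(
    user_reward(features_a, z_user_2), user_reward(features_b, z_user_2)
)
```

Here the same pair of responses yields opposite preference probabilities for the two users, which is the kind of conflicting signal a single shared reward cannot represent.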

Discovering Preference Optimization Algorithms with and for Large Language Models

Jun 12, 2024

Chris Lu, Samuel Holt, Claudio Fanconi, Alex J. Chan, Jakob Foerster, Mihaela van der Schaar, Robert Tjarko Lange

Abstract: Offline preference optimization is a key method for enhancing and controlling the quality of Large Language Model (LLM) outputs. Typically, preference optimization is approached as an offline supervised learning task using manually crafted convex loss functions. While these methods are based on theoretical insights, they are inherently constrained by human creativity, so the large search space of possible loss functions remains underexplored. We address this by performing LLM-driven objective discovery to automatically discover new state-of-the-art preference optimization algorithms without (expert) human intervention. Specifically, we iteratively prompt an LLM to propose and implement new preference optimization loss functions based on previously evaluated performance metrics. This process leads to the discovery of previously unknown and performant preference optimization algorithms. The best performing of these we call Discovered Preference Optimization (DiscoPOP), a novel algorithm that adaptively blends logistic and exponential losses. Experiments demonstrate the state-of-the-art performance of DiscoPOP and its successful transfer to held-out tasks.
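
To make "adaptively blends logistic and exponential losses" concrete, one possible blend is sketched below. The abstract does not state the formula, so the sigmoid gate, the temperature `tau`, and the function names are assumptions for illustration rather than the published DiscoPOP objective. The input `x` stands for the scaled difference of log-ratios between the chosen and rejected responses.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def logistic_loss(x: float) -> float:
    """-log(sigmoid(x)), computed without overflow for large |x|."""
    if x >= 0:
        return math.log1p(math.exp(-x))
    return -x + math.log1p(math.exp(x))


def exponential_loss(x: float) -> float:
    return math.exp(-x)


def blended_loss(x: float, tau: float = 0.05) -> float:
    """Gate between the two losses with sigmoid(x / tau) (assumed form).

    For strongly negative x the gate is near 0 and the logistic loss
    dominates; for strongly positive x the gate is near 1 and the
    exponential loss dominates.
    """
    gate = sigmoid(x / tau)
    return (1.0 - gate) * logistic_loss(x) + gate * exponential_loss(x)
```

A small `tau` makes the switch between the two regimes sharp around `x = 0`; a larger value would mix the two losses over a wider range.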

This Reads Like That: Deep Learning for Interpretable Natural Language Processing

Oct 25, 2023

Claudio Fanconi, Moritz Vandenhirtz, Severin Husmann, Julia E. Vogt

Abstract: Prototype learning, a popular machine learning method designed for inherently interpretable decisions, leverages similarities to learned prototypes for classifying new data. While it is mainly applied in computer vision, in this work, we build upon prior research and further explore the extension of prototypical networks to natural language processing. We introduce a learned weighted similarity measure that enhances the similarity computation by focusing on informative dimensions of pre-trained sentence embeddings. Additionally, we propose a post-hoc explainability mechanism that extracts prediction-relevant words from both the prototype and input sentences. Finally, we empirically demonstrate that our proposed method not only improves predictive performance on the AG News and RT Polarity datasets over a previous prototype-based approach, but also improves the faithfulness of explanations compared to rationale-based recurrent convolutions.

* 10 pages, 1 figure, 5 tables
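
One way to picture a weighted similarity measure over sentence embeddings is a cosine similarity with per-dimension weights, sketched below. The abstract does not specify the exact form, so the weighted-cosine formula, the non-negative weight vector `w`, and the function name are assumptions; in a trained model the weights would be learned rather than set by hand.

```python
import math
from typing import Sequence


def weighted_cosine(u: Sequence[float], v: Sequence[float], w: Sequence[float]) -> float:
    """Cosine similarity in which dimension i is scaled by a weight w[i] >= 0.

    Dimensions with weight 0 are ignored; uniform weights recover the
    ordinary cosine similarity.
    """
    dot = sum(wi * ui * vi for wi, ui, vi in zip(w, u, v))
    norm_u = math.sqrt(sum(wi * ui * ui for wi, ui in zip(w, u)))
    norm_v = math.sqrt(sum(wi * vi * vi for wi, vi in zip(w, v)))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy embeddings: the vectors agree on the first dimension only.
embedding = [1.0, 0.0]
prototype = [1.0, 1.0]

plain = weighted_cosine(embedding, prototype, [1.0, 1.0])    # ordinary cosine
focused = weighted_cosine(embedding, prototype, [1.0, 0.0])  # ignore dimension 2
```

Down-weighting the uninformative second dimension raises the similarity between the sentence and the prototype.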

Natural Language Processing Methods to Identify Oncology Patients at High Risk for Acute Care with Clinical Notes

Sep 28, 2022

Claudio Fanconi, Marieke van Buchem, Tina Hernandez-Boussard

Abstract: Clinical notes are an essential component of a health record. This paper evaluates how natural language processing (NLP) can be used to identify the risk of acute care use (ACU) in oncology patients once chemotherapy starts. Risk prediction using structured health data (SHD) is now standard, but predictions using free-text formats are complex. This paper explores the use of free-text notes, instead of SHD, for the prediction of ACU. Deep learning models were compared to manually engineered language features. Results show that SHD models minimally outperform NLP models; an l1-penalised logistic regression with SHD achieved a C-statistic of 0.748 (95%-CI: 0.735, 0.762), while the same model with language features achieved 0.730 (95%-CI: 0.717, 0.745) and a transformer-based model achieved 0.702 (95%-CI: 0.688, 0.717). This paper shows how language models can be used in clinical applications and underlines how risk bias is different for diverse patient groups, even using only free-text data.

* 11 pages, 6 figures, 2 tables
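
For readers unfamiliar with the metric reported above, the C-statistic for a binary outcome is the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative case, with ties counted as one half. The sketch below computes it by direct pairwise comparison; the function name and the toy labels and scores are made up for illustration and are unrelated to the paper's data.

```python
from typing import Sequence


def c_statistic(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Concordance statistic (equivalent to ROC AUC) for binary labels.

    Compares every (positive, negative) pair; a pair is concordant when the
    positive case has the higher score, and ties count as 0.5.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(positives) * len(negatives))


# Toy example: 1 = acute care use occurred, 0 = it did not.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.1, 0.4, 0.6]
toy_c = c_statistic(toy_labels, toy_scores)
```

In the toy data three of the four positive-negative pairs are ranked correctly, so the statistic is 0.75. The pairwise loop is quadratic in the number of cases, which is fine for illustration but slow on large cohorts.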

This Looks Like That Does it? Shortcomings of Latent Space Prototype Interpretability in Deep Networks

May 10, 2021

Adrian Hoffmann, Claudio Fanconi, Rahul Rade, Jonas Kohler

Abstract: Deep neural networks that yield human-interpretable decisions by architectural design have lately become an increasingly popular alternative to post hoc interpretation of traditional black-box models. Among these networks, arguably the most widespread approach is so-called prototype learning, where similarities to learned latent prototypes serve as the basis for classifying an unseen data point. In this work, we point to an important shortcoming of such approaches. Namely, there is a semantic gap between similarity in latent space and similarity in input space, which can corrupt interpretability. We design two experiments that exemplify this issue on the so-called ProtoPNet. Specifically, we find that this network's interpretability mechanism can be led astray by intentionally crafted or even JPEG compression artefacts, which can produce incomprehensible decisions. We argue that practitioners ought to have this shortcoming in mind when deploying prototype-based models in practice.
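
The semantic gap mentioned in this abstract can be illustrated with a deliberately simple toy: an encoder that discards part of the input can place two very different inputs at the same point in latent space. The encoder, the inputs, and the function names below are hypothetical and far simpler than ProtoPNet; the sketch only shows why latent closeness need not imply input closeness.

```python
import math
from typing import List, Sequence


def toy_encoder(x: Sequence[float]) -> List[float]:
    """Toy encoder (assumption): keeps only the first input coordinate."""
    return [x[0]]


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


prototype_input = [1.0, 0.0]
lookalike_input = [1.1, 0.0]   # close to the prototype in input space
different_input = [1.0, 5.0]   # far from the prototype in input space

input_dist_lookalike = l2_distance(lookalike_input, prototype_input)
input_dist_different = l2_distance(different_input, prototype_input)

latent_dist_lookalike = l2_distance(
    toy_encoder(lookalike_input), toy_encoder(prototype_input)
)
latent_dist_different = l2_distance(
    toy_encoder(different_input), toy_encoder(prototype_input)
)
```

In input space the lookalike is the nearer neighbour of the prototype, yet in latent space the very different input sits exactly on the prototype, so a "this looks like that" explanation based on latent distance would point at the wrong example.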
