Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Gianluca Demartini

How Do Experts Make Sense of Integrated Process Models?

May 28, 2025

Tianwa Chen, Barbara Weber, Graeme Shanks, Gianluca Demartini, Marta Indulska, Shazia Sadiq

Abstract:A range of integrated modeling approaches have been developed to enable a holistic representation of business process logic together with all relevant business rules. These approaches address inherent problems with separate documentation of business process models and business rules. In this study, we explore how expert process workers make sense of the information provided through such integrated modeling approaches. To do so, we complement verbal protocol analysis with eye-tracking metrics to reveal nuanced user behaviours involved in the main phases of sensemaking, namely information foraging and information processing. By studying expert process workers engaged in tasks based on integrated modeling of business processes and rules, we provide insights that pave the way for a better understanding of sensemaking practices and improved development of business process and business rule integration approaches. Our research underscores the importance of offering personalized support mechanisms that increase the efficacy and efficiency of sensemaking practices for process knowledge workers.

Via

Access Paper or Ask Questions

LLM-based Semantic Augmentation for Harmful Content Detection

Apr 22, 2025

Elyas Meguellati, Assaad Zeghina, Shazia Sadiq, Gianluca Demartini

Abstract:Recent advances in large language models (LLMs) have demonstrated strong performance on simple text classification tasks, frequently under zero-shot settings. However, their efficacy declines when tackling complex social media challenges such as propaganda detection, hateful meme classification, and toxicity identification. Much of the existing work has focused on using LLMs to generate synthetic training data, overlooking the potential of LLM-based text preprocessing and semantic augmentation. In this paper, we introduce an approach that prompts LLMs to clean noisy text and provide context-rich explanations, thereby enhancing training sets without substantial increases in data volume. We systematically evaluate on the SemEval 2024 multi-label Persuasive Meme dataset and further validate on the Google Jigsaw toxic comments and Facebook hateful memes datasets to assess generalizability. Our results reveal that zero-shot LLM classification underperforms on these high-context tasks compared to supervised models. In contrast, integrating LLM-based semantic augmentation yields performance on par with approaches that rely on human-annotated data, at a fraction of the cost. These findings underscore the importance of strategically incorporating LLMs into machine learning (ML) pipeline for social media classification tasks, offering broad implications for combating harmful content online.

Via

Access Paper or Ask Questions

Spotting Persuasion: A Low-cost Model for Persuasion Detection in Political Ads on Social Media

Mar 18, 2025

Elyas Meguellati, Stefano Civelli, Pietro Bernardelle, Shazia Sadiq, Gianluca Demartini

Abstract:In the realm of political advertising, persuasion operates as a pivotal element within the broader framework of propaganda, exerting profound influences on public opinion and electoral outcomes. In this paper, we (1) introduce a lightweight model for persuasive text detection that achieves state-of-the-art performance in Subtask 3 of SemEval 2023 Task 3, while significantly reducing the computational resource requirements; and (2) leverage the proposed model to gain insights into political campaigning strategies on social media platforms by applying it to a real-world dataset we curated, consisting of Facebook political ads from the 2022 Australian Federal election campaign. Our study shows how subtleties can be found in persuasive political advertisements and presents a pragmatic approach to detect and analyze such strategies with limited resources, enhancing transparency in social media political campaigns.

Via

Access Paper or Ask Questions

Leveraging Semantic Type Dependencies for Clinical Named Entity Recognition

Mar 07, 2025

Linh Le, Guido Zuccon, Gianluca Demartini, Genghong Zhao, Xia Zhang

Figure 1 for Leveraging Semantic Type Dependencies for Clinical Named Entity Recognition

Figure 2 for Leveraging Semantic Type Dependencies for Clinical Named Entity Recognition

Figure 3 for Leveraging Semantic Type Dependencies for Clinical Named Entity Recognition

Figure 4 for Leveraging Semantic Type Dependencies for Clinical Named Entity Recognition

Abstract:Previous work on clinical relation extraction from free-text sentences leveraged information about semantic types from clinical knowledge bases as a part of entity representations. In this paper, we exploit additional evidence by also making use of domain-specific semantic type dependencies. We encode the relation between a span of tokens matching a Unified Medical Language System (UMLS) concept and other tokens in the sentence. We implement our method and compare against different named entity recognition (NER) architectures (i.e., BiLSTM-CRF and BiLSTM-GCN-CRF) using different pre-trained clinical embeddings (i.e., BERT, BioBERT, UMLSBert). Our experimental results on clinical datasets show that in some cases NER effectiveness can be significantly improved by making use of domain-specific semantic type dependencies. Our work is also the first study generating a matrix encoding to make use of more than three dependencies in one pass for the NER task.

* AMIA - American Medical Informatics Association 2022

Via

Access Paper or Ask Questions

Are Large Language Models Good Data Preprocessors?

Feb 24, 2025

Elyas Meguellati, Nardiena Pratama, Shazia Sadiq, Gianluca Demartini

Abstract:High-quality textual training data is essential for the success of multimodal data processing tasks, yet outputs from image captioning models like BLIP and GIT often contain errors and anomalies that are difficult to rectify using rule-based methods. While recent work addressing this issue has predominantly focused on using GPT models for data preprocessing on relatively simple public datasets, there is a need to explore a broader range of Large Language Models (LLMs) and tackle more challenging and diverse datasets. In this study, we investigate the use of multiple LLMs, including LLaMA 3.1 70B, GPT-4 Turbo, and Sonnet 3.5 v2, to refine and clean the textual outputs of BLIP and GIT. We assess the impact of LLM-assisted data cleaning by comparing downstream-task (SemEval 2024 Subtask "Multilabel Persuasion Detection in Memes") models trained on cleaned versus non-cleaned data. While our experimental results show improvements when using LLM-cleaned captions, statistical tests reveal that most of these improvements are not significant. This suggests that while LLMs have the potential to enhance data cleaning and repairing, their effectiveness may be limited depending on the context they are applied to, the complexity of the task, and the level of noise in the text. Our findings highlight the need for further research into the capabilities and limitations of LLMs in data preprocessing pipelines, especially when dealing with challenging datasets, contributing empirical evidence to the ongoing discussion about integrating LLMs into data preprocessing pipelines.

Via

Access Paper or Ask Questions

Plan-Then-Execute: An Empirical Study of User Trust and Team Performance When Using LLM Agents As A Daily Assistant

Feb 03, 2025

Gaole He, Gianluca Demartini, Ujwal Gadiraju

Figure 1 for Plan-Then-Execute: An Empirical Study of User Trust and Team Performance When Using LLM Agents As A Daily Assistant

Figure 2 for Plan-Then-Execute: An Empirical Study of User Trust and Team Performance When Using LLM Agents As A Daily Assistant

Figure 3 for Plan-Then-Execute: An Empirical Study of User Trust and Team Performance When Using LLM Agents As A Daily Assistant

Figure 4 for Plan-Then-Execute: An Empirical Study of User Trust and Team Performance When Using LLM Agents As A Daily Assistant

Abstract:Since the explosion in popularity of ChatGPT, large language models (LLMs) have continued to impact our everyday lives. Equipped with external tools that are designed for a specific purpose (e.g., for flight booking or an alarm clock), LLM agents exercise an increasing capability to assist humans in their daily work. Although LLM agents have shown a promising blueprint as daily assistants, there is a limited understanding of how they can provide daily assistance based on planning and sequential decision making capabilities. We draw inspiration from recent work that has highlighted the value of 'LLM-modulo' setups in conjunction with humans-in-the-loop for planning tasks. We conducted an empirical study (N = 248) of LLM agents as daily assistants in six commonly occurring tasks with different levels of risk typically associated with them (e.g., flight ticket booking and credit card payments). To ensure user agency and control over the LLM agent, we adopted LLM agents in a plan-then-execute manner, wherein the agents conducted step-wise planning and step-by-step execution in a simulation environment. We analyzed how user involvement at each stage affects their trust and collaborative team performance. Our findings demonstrate that LLM agents can be a double-edged sword -- (1) they can work well when a high-quality plan and necessary user involvement in execution are available, and (2) users can easily mistrust the LLM agents with plans that seem plausible. We synthesized key insights for using LLM agents as daily assistants to calibrate user trust and achieve better overall task outcomes. Our work has important implications for the future design of daily assistants and human-AI collaboration with LLM agents.

* conditionally accepted to CHI 2025

Via

Access Paper or Ask Questions

SubData: A Python Library to Collect and Combine Datasets for Evaluating LLM Alignment on Downstream Tasks

Dec 21, 2024

Leon Fröhling, Pietro Bernardelle, Gianluca Demartini

Abstract:With the release of ever more capable large language models (LLMs), researchers in NLP and related disciplines have started to explore the usability of LLMs for a wide variety of different annotation tasks. Very recently, a lot of this attention has shifted to tasks that are subjective in nature. Given that the latest generations of LLMs have digested and encoded extensive knowledge about different human subpopulations and individuals, the hope is that these models can be trained, tuned or prompted to align with a wide range of different human perspectives. While researchers already evaluate the success of this alignment via surveys and tests, there is a lack of resources to evaluate the alignment on what oftentimes matters the most in NLP; the actual downstream tasks. To fill this gap we present SubData, a Python library that offers researchers working on topics related to subjectivity in annotation tasks a convenient way of collecting, combining and using a range of suitable datasets.

* 4 pages, 1 figure

Via

Access Paper or Ask Questions

Mapping and Influencing the Political Ideology of Large Language Models using Synthetic Personas

Dec 19, 2024

Pietro Bernardelle, Leon Fröhling, Stefano Civelli, Riccardo Lunardi, Kevin Roiter, Gianluca Demartini

Figure 1 for Mapping and Influencing the Political Ideology of Large Language Models using Synthetic Personas

Figure 2 for Mapping and Influencing the Political Ideology of Large Language Models using Synthetic Personas

Figure 3 for Mapping and Influencing the Political Ideology of Large Language Models using Synthetic Personas

Figure 4 for Mapping and Influencing the Political Ideology of Large Language Models using Synthetic Personas

Abstract:The analysis of political biases in large language models (LLMs) has primarily examined these systems as single entities with fixed viewpoints. While various methods exist for measuring such biases, the impact of persona-based prompting on LLMs' political orientation remains unexplored. In this work we leverage PersonaHub, a collection of synthetic persona descriptions, to map the political distribution of persona-based prompted LLMs using the Political Compass Test (PCT). We then examine whether these initial compass distributions can be manipulated through explicit ideological prompting towards diametrically opposed political orientations: right-authoritarian and left-libertarian. Our experiments reveal that synthetic personas predominantly cluster in the left-libertarian quadrant, with models demonstrating varying degrees of responsiveness when prompted with explicit ideological descriptors. While all models demonstrate significant shifts towards right-authoritarian positions, they exhibit more limited shifts towards left-libertarian positions, suggesting an asymmetric response to ideological manipulation that may reflect inherent biases in model training.

* 4 pages, 2 figures, 2 tables

Via

Access Paper or Ask Questions

Perception of Visual Content: Differences Between Humans and Foundation Models

Nov 28, 2024

Nardiena A. Pratama, Shaoyang Fan, Gianluca Demartini

Figure 1 for Perception of Visual Content: Differences Between Humans and Foundation Models

Figure 2 for Perception of Visual Content: Differences Between Humans and Foundation Models

Figure 3 for Perception of Visual Content: Differences Between Humans and Foundation Models

Figure 4 for Perception of Visual Content: Differences Between Humans and Foundation Models

Abstract:Human-annotated content is often used to train machine learning (ML) models. However, recently, language and multi-modal foundational models have been used to replace and scale-up human annotator's efforts. This study compares human-generated and ML-generated annotations of images representing diverse socio-economic contexts. We aim to understand differences in perception and identify potential biases in content interpretation. Our dataset comprises images of people from various geographical regions and income levels washing their hands. We compare human and ML-generated annotations semantically and evaluate their impact on predictive models. Our results show low similarity between human and machine annotations from a low-level perspective, i.e., types of words that appear and sentence structures, but are alike in how similar or dissimilar they perceive images across different regions. Additionally, human annotations resulted in best overall and most balanced region classification performance on the class level, while ML Objects and ML Captions performed best for income regression. Humans and machines' similarity in their lack of bias when perceiving images highlights how they are more alike than what was initially perceived. The superior and fairer performance of using human annotations for region classification and machine annotations for income regression show how important the quality of the images and the discriminative features in the annotations are.

Via

Access Paper or Ask Questions

Natural Language Processing for the Legal Domain: A Survey of Tasks, Datasets, Models, and Challenges

Oct 25, 2024

Farid Ariai, Gianluca Demartini

Figure 1 for Natural Language Processing for the Legal Domain: A Survey of Tasks, Datasets, Models, and Challenges

Figure 2 for Natural Language Processing for the Legal Domain: A Survey of Tasks, Datasets, Models, and Challenges

Figure 3 for Natural Language Processing for the Legal Domain: A Survey of Tasks, Datasets, Models, and Challenges

Figure 4 for Natural Language Processing for the Legal Domain: A Survey of Tasks, Datasets, Models, and Challenges

Abstract:Natural Language Processing is revolutionizing the way legal professionals and laypersons operate in the legal field. The considerable potential for Natural Language Processing in the legal sector, especially in developing computational tools for various legal processes, has captured the interest of researchers for years. This survey follows the Preferred Reporting Items for Systematic Reviews and Meta-Analyses framework, reviewing 148 studies, with a final selection of 127 after manual filtering. It explores foundational concepts related to Natural Language Processing in the legal domain, illustrating the unique aspects and challenges of processing legal texts, such as extensive document length, complex language, and limited open legal datasets. We provide an overview of Natural Language Processing tasks specific to legal text, such as Legal Document Summarization, legal Named Entity Recognition, Legal Question Answering, Legal Text Classification, and Legal Judgment Prediction. In the section on legal Language Models, we analyze both developed Language Models and approaches for adapting general Language Models to the legal domain. Additionally, we identify 15 Open Research Challenges, including bias in Artificial Intelligence applications, the need for more robust and interpretable models, and improving explainability to handle the complexities of legal language and reasoning.

* 35 pages

Via

Access Paper or Ask Questions