Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Tomas Ward

Adversarial Testing in LLMs: Insights into Decision-Making Vulnerabilities

May 19, 2025

Lili Zhang, Haomiaomiao Wang, Long Cheng, Libao Deng, Tomas Ward

Abstract:As Large Language Models (LLMs) become increasingly integrated into real-world decision-making systems, understanding their behavioural vulnerabilities remains a critical challenge for AI safety and alignment. While existing evaluation metrics focus primarily on reasoning accuracy or factual correctness, they often overlook whether LLMs are robust to adversarial manipulation or capable of using adaptive strategy in dynamic environments. This paper introduces an adversarial evaluation framework designed to systematically stress-test the decision-making processes of LLMs under interactive and adversarial conditions. Drawing on methodologies from cognitive psychology and game theory, our framework probes how models respond in two canonical tasks: the two-armed bandit task and the Multi-Round Trust Task. These tasks capture key aspects of exploration-exploitation trade-offs, social cooperation, and strategic flexibility. We apply this framework to several state-of-the-art LLMs, including GPT-3.5, GPT-4, Gemini-1.5, and DeepSeek-V3, revealing model-specific susceptibilities to manipulation and rigidity in strategy adaptation. Our findings highlight distinct behavioral patterns across models and emphasize the importance of adaptability and fairness recognition for trustworthy AI deployment. Rather than offering a performance benchmark, this work proposes a methodology for diagnosing decision-making weaknesses in LLM-based agents, providing actionable insights for alignment and safety research.

Via

Access Paper or Ask Questions

Single Channel-based Motor Imagery Classification using Fisher's Ratio and Pearson Correlation

Jun 20, 2024

Sonal Santosh Baberwal, Tomas Ward, Shirley Coyle

Abstract:Motor imagery-based BCI systems have been promising and gaining popularity in rehabilitation and Activities of daily life(ADL). Despite this, the technology is still emerging and has not yet been outside the laboratory constraints. Channel reduction is one contributing avenue to make these systems part of ADL. Although Motor Imagery classification heavily depends on spatial factors, single channel-based classification remains an avenue to be explored thoroughly. Since Fisher's ratio and Pearson Correlation are powerful measures actively used in the domain, we propose an integrated framework (FRPC integrated framework) that integrates Fisher's Ratio to select the best channel and Pearson correlation to select optimal filter banks and extract spectral and temporal features respectively. The framework is tested for a 2-class motor imagery classification on 2 open-source datasets and 1 collected dataset and compared with state-of-art work. Apart from implementing the framework, this study also explores the most optimal channel among all the subjects and later explores classes where the single-channel framework is efficient.

Via

Access Paper or Ask Questions

Privacy-Aware Energy Consumption Modeling of Connected Battery Electric Vehicles using Federated Learning

Dec 12, 2023

Sen Yan, Hongyuan Fang, Ji Li, Tomas Ward, Noel O'Connor, Mingming Liu

Abstract:Battery Electric Vehicles (BEVs) are increasingly significant in modern cities due to their potential to reduce air pollution. Precise and real-time estimation of energy consumption for them is imperative for effective itinerary planning and optimizing vehicle systems, which can reduce driving range anxiety and decrease energy costs. As public awareness of data privacy increases, adopting approaches that safeguard data privacy in the context of BEV energy consumption modeling is crucial. Federated Learning (FL) is a promising solution mitigating the risk of exposing sensitive information to third parties by allowing local data to remain on devices and only sharing model updates with a central server. Our work investigates the potential of using FL methods, such as FedAvg, and FedPer, to improve BEV energy consumption prediction while maintaining user privacy. We conducted experiments using data from 10 BEVs under simulated real-world driving conditions. Our results demonstrate that the FedAvg-LSTM model achieved a reduction of up to 67.84\% in the MAE value of the prediction results. Furthermore, we explored various real-world scenarios and discussed how FL methods can be employed in those cases. Our findings show that FL methods can effectively improve the performance of BEV energy consumption prediction while maintaining user privacy.

* This paper is accepted by IEEE Transactions on Transportation Electrification (TTE) on December 4, 2023. (13 pages, 6 figures, and 6 tables)

Via

Access Paper or Ask Questions

Offline Detection of Misspelled Handwritten Words by Convolving Recognition Model Features with Text Labels

Sep 18, 2023

Andrey Totev, Tomas Ward

Abstract:Offline handwriting recognition (HWR) has improved significantly with the advent of deep learning architectures in recent years. Nevertheless, it remains a challenging problem and practical applications often rely on post-processing techniques for restricting the predicted words via lexicons or language models. Despite their enhanced performance, such systems are less usable in contexts where out-of-vocabulary words are anticipated, e.g. for detecting misspelled words in school assessments. To that end, we introduce the task of comparing a handwriting image to text. To solve the problem, we propose an unrestricted binary classifier, consisting of a HWR feature extractor and a multimodal classification head which convolves the feature extractor output with the vector representation of the input text. Our model's classification head is trained entirely on synthetic data created using a state-of-the-art generative adversarial network. We demonstrate that, while maintaining high recall, the classifier can be calibrated to achieve an average precision increase of 19.5% compared to addressing the task by directly using state-of-the-art HWR models. Such massive performance gains can lead to significant productivity increases in applications utilizing human-in-the-loop automation.

Via

Access Paper or Ask Questions

ECG synthesis with Neural ODE and GAN models

Oct 30, 2021

Mansura Habiba, Eoin Borphy, Barak A. Pearlmutter, Tomas Ward

Figure 1 for ECG synthesis with Neural ODE and GAN models

Figure 2 for ECG synthesis with Neural ODE and GAN models

Figure 3 for ECG synthesis with Neural ODE and GAN models

Figure 4 for ECG synthesis with Neural ODE and GAN models

Abstract:Continuous medical time series data such as ECG is one of the most complex time series due to its dynamic and high dimensional characteristics. In addition, due to its sensitive nature, privacy concerns and legal restrictions, it is often even complex to use actual data for different medical research. As a result, generating continuous medical time series is a very critical research area. Several research works already showed that the ability of generative adversarial networks (GANs) in the case of continuous medical time series generation is promising. Most medical data generation works, such as ECG synthesis, are mainly driven by the GAN model and its variation. On the other hand, Some recent work on Neural Ordinary Differential Equation (Neural ODE) demonstrates its strength against informative missingness, high dimension as well as dynamic nature of continuous time series. Instead of considering continuous-time series as a discrete-time sequence, Neural ODE can train continuous time series in real-time continuously. In this work, we used Neural ODE based model to generate synthetic sine waves and synthetic ECG. We introduced a new technique to design the generative adversarial network with Neural ODE based Generator and Discriminator. We developed three new models to synthesise continuous medical data. Different evaluation metrics are then used to quantitatively assess the quality of generated synthetic data for real-world applications and data analysis. Another goal of this work is to combine the strength of GAN and Neural ODE to generate synthetic continuous medical time series data such as ECG. We also evaluated both the GAN model and the Neural ODE model to understand the comparative efficiency of models from the GAN and Neural ODE family in medical data synthesis.

* Proc. of the International Conference on Electrical, Computer and Energy Technologies (ICECET), 9-10 December 2021, Cape Town-South Africa

Via

Access Paper or Ask Questions

Exploration of Algorithmic Trading Strategies for the Bitcoin Market

Oct 28, 2021

Nathan Crone, Eoin Brophy, Tomas Ward

Figure 1 for Exploration of Algorithmic Trading Strategies for the Bitcoin Market

Figure 2 for Exploration of Algorithmic Trading Strategies for the Bitcoin Market

Figure 3 for Exploration of Algorithmic Trading Strategies for the Bitcoin Market

Figure 4 for Exploration of Algorithmic Trading Strategies for the Bitcoin Market

Abstract:Bitcoin is firmly becoming a mainstream asset in our global society. Its highly volatile nature has traders and speculators flooding into the market to take advantage of its significant price swings in the hope of making money. This work brings an algorithmic trading approach to the Bitcoin market to exploit the variability in its price on a day-to-day basis through the classification of its direction. Building on previous work, in this paper, we utilise both features internal to the Bitcoin network and external features to inform the prediction of various machine learning models. As an empirical test of our models, we evaluate them using a real-world trading strategy on completely unseen data collected throughout the first quarter of 2021. Using only a binary predictor, at the end of our three-month trading period, our models showed an average profit of 86\%, matching the results of the more traditional buy-and-hold strategy. However, after incorporating a risk tolerance score into our trading strategy by utilising the model's prediction confidence scores, our models were 12.5\% more profitable than the simple buy-and-hold strategy. These results indicate the credible potential that machine learning models have in extracting profit from the Bitcoin market and act as a front-runner for further research into real-world Bitcoin trading.

* 9 pages, 4 figures, submitted to Soft Computing Letters

Via

Access Paper or Ask Questions

Generation of Synthetic Electronic Health Records Using a Federated GAN

Sep 06, 2021

John Weldon, Tomas Ward, Eoin Brophy

Figure 1 for Generation of Synthetic Electronic Health Records Using a Federated GAN

Figure 2 for Generation of Synthetic Electronic Health Records Using a Federated GAN

Figure 3 for Generation of Synthetic Electronic Health Records Using a Federated GAN

Figure 4 for Generation of Synthetic Electronic Health Records Using a Federated GAN

Abstract:Sensitive medical data is often subject to strict usage constraints. In this paper, we trained a generative adversarial network (GAN) on real-world electronic health records (EHR). It was then used to create a data-set of "fake" patients through synthetic data generation (SDG) to circumvent usage constraints. This real-world data was tabular, binary, intensive care unit (ICU) patient diagnosis data. The entire data-set was split into separate data silos to mimic real-world scenarios where multiple ICU units across different hospitals may have similarly structured data-sets within their own organisations but do not have access to each other's data-sets. We implemented federated learning (FL) to train separate GANs locally at each organisation, using their unique data silo and then combining the GANs into a single central GAN, without any siloed data ever being exposed. This global, central GAN was then used to generate the synthetic patients data-set. We performed an evaluation of these synthetic patients with statistical measures and through a structured review by a group of medical professionals. It was shown that there was no significant reduction in the quality of the synthetic EHR when we moved between training a single central model and training on separate data silos with individual models before combining them into a central model. This was true for both the statistical evaluation (Root Mean Square Error (RMSE) of 0.0154 for single-source vs. RMSE of 0.0169 for dual-source federated) and also for the medical professionals' evaluation (no quality difference between EHR generated from a single source and EHR generated from multiple sources).

Via

Access Paper or Ask Questions

Generative adversarial networks in time series: A survey and taxonomy

Jul 23, 2021

Eoin Brophy, Zhengwei Wang, Qi She, Tomas Ward

Figure 1 for Generative adversarial networks in time series: A survey and taxonomy

Figure 2 for Generative adversarial networks in time series: A survey and taxonomy

Figure 3 for Generative adversarial networks in time series: A survey and taxonomy

Figure 4 for Generative adversarial networks in time series: A survey and taxonomy

Abstract:Generative adversarial networks (GANs) studies have grown exponentially in the past few years. Their impact has been seen mainly in the computer vision field with realistic image and video manipulation, especially generation, making significant advancements. While these computer vision advances have garnered much attention, GAN applications have diversified across disciplines such as time series and sequence generation. As a relatively new niche for GANs, fieldwork is ongoing to develop high quality, diverse and private time series data. In this paper, we review GAN variants designed for time series related applications. We propose a taxonomy of discrete-variant GANs and continuous-variant GANs, in which GANs deal with discrete time series and continuous time series data. Here we showcase the latest and most popular literature in this field; their architectures, results, and applications. We also provide a list of the most popular evaluation metrics and their suitability across applications. Also presented is a discussion of privacy measures for these GANs and further protections and directions for dealing with sensitive data. We aim to frame clearly and concisely the latest and state-of-the-art research in this area and their applications to real-world technologies.

Via

Access Paper or Ask Questions

Estimation of Continuous Blood Pressure from PPG via a Federated Learning Approach

Feb 24, 2021

Eoin Brophy, Maarten De Vos, Geraldine Boylan, Tomas Ward

Figure 1 for Estimation of Continuous Blood Pressure from PPG via a Federated Learning Approach

Figure 2 for Estimation of Continuous Blood Pressure from PPG via a Federated Learning Approach

Figure 3 for Estimation of Continuous Blood Pressure from PPG via a Federated Learning Approach

Figure 4 for Estimation of Continuous Blood Pressure from PPG via a Federated Learning Approach

Abstract:Ischemic heart disease is the highest cause of mortality globally each year. This not only puts a massive strain on the lives of those affected but also on the public healthcare systems. To understand the dynamics of the healthy and unhealthy heart doctors commonly use electrocardiogram (ECG) and blood pressure (BP) readings. These methods are often quite invasive, in particular when continuous arterial blood pressure (ABP) readings are taken and not to mention very costly. Using machine learning methods we seek to develop a framework that is capable of inferring ABP from a single optical photoplethysmogram (PPG) sensor alone. We train our framework across distributed models and data sources to mimic a large-scale distributed collaborative learning experiment that could be implemented across low-cost wearables. Our time series-to-time series generative adversarial network (T2TGAN) is capable of high-quality continuous ABP generation from a PPG signal with a mean error of 2.54 mmHg and a standard deviation of 23.7 mmHg when estimating mean arterial pressure on a previously unseen, noisy, independent dataset. To our knowledge, this framework is the first example of a GAN capable of continuous ABP generation from an input PPG signal that also uses a federated learning methodology.

Via

Access Paper or Ask Questions