Alexander N. Gorban

Staining and locking computer vision models without retraining

Jul 29, 2025

Oliver J. Sutton, Qinghua Zhou, George Leete, Alexander N. Gorban, Ivan Y. Tyukin

Abstract: We introduce new methods of staining and locking computer vision models, to protect their owners' intellectual property. Staining, also known as watermarking, embeds secret behaviour into a model which can later be used to identify it, while locking aims to make a model unusable unless a secret trigger is inserted into input images. Unlike existing methods, our algorithms can be used to stain and lock pre-trained models without requiring fine-tuning or retraining, and come with provable, computable guarantees bounding their worst-case false positive rates. The stain and lock are implemented by directly modifying a small number of the model's weights and have minimal impact on the (unlocked) model's performance. Locked models are unlocked by inserting a small 'trigger patch' into the corner of the input image. We present experimental results showing the efficacy of our methods and demonstrating their practical performance on a variety of computer vision models.

* 10 pages, 9 pages of appendices, 10 figures

When fractional quasi p-norms concentrate

May 26, 2025

Ivan Y. Tyukin, Bogdan Grechuk, Evgeny M. Mirkes, Alexander N. Gorban

Abstract: Concentration of distances in high dimension is an important factor for the development and design of stable and reliable data analysis algorithms. In this paper, we address the fundamental long-standing question about the concentration of distances in high dimension for fractional quasi $p$-norms, $p\in(0,1)$. The topic has been at the centre of various theoretical and empirical controversies. Here we, for the first time, identify conditions when fractional quasi $p$-norms concentrate and when they don't. We show that, contrary to some earlier suggestions, for broad classes of distributions, fractional quasi $p$-norms admit exponential and uniform in $p$ concentration bounds. For these distributions, the results effectively rule out previously proposed approaches to alleviate concentration by setting "optimal" values of $p$ in $(0,1)$. At the same time, we specify conditions and the corresponding families of distributions for which one can still control concentration rates by appropriate choices of $p$. We also show that in an arbitrarily small vicinity of a distribution from a large class of distributions for which uniform concentration occurs, there are uncountably many other distributions featuring anti-concentration properties. Importantly, this behavior enables devising relevant data encoding or representation schemes favouring or discouraging distance concentration. The results shed new light on this long-standing problem and resolve the tension around the topic in both theory and empirical evidence reported in the literature.
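
As a rough illustration of what "concentration" means here, the sketch below measures the relative spread (max - min) / min of quasi $p$-norms of random points as the dimension grows. It is not taken from the paper: the function name `relative_spread`, the sample size, and the choice of i.i.d. uniform coordinates on [0, 1] are all assumptions made for the example, and the paper's conditions on distributions are more general.

```python
import random


def relative_spread(dim, p, n_points=200, seed=0):
    """Return (max - min) / min of quasi p-norms over random points.

    Assumption for illustration only: coordinates are i.i.d. uniform on [0, 1].
    """
    rng = random.Random(seed)
    norms = []
    for _ in range(n_points):
        # Quasi p-norm: (sum_i |x_i|^p)^(1/p); for p in (0, 1) the triangle
        # inequality fails, so this is not a true norm.
        total = sum(rng.random() ** p for _ in range(dim))
        norms.append(total ** (1.0 / p))
    return (max(norms) - min(norms)) / min(norms)


if __name__ == "__main__":
    # Print the spread for several p and dimensions; a shrinking spread
    # along each row indicates that the norms concentrate.
    for p in (0.25, 0.5, 1.0, 2.0):
        row = [round(relative_spread(d, p), 3) for d in (10, 100, 1000)]
        print(f"p={p}: {row}")
```

For this particular toy distribution the spread shrinks with dimension for fractional $p$ as well as for $p \ge 1$, which is the kind of behaviour the abstract describes for broad classes of distributions.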

Stealth edits for provably fixing or attacking large language models

Jun 18, 2024

Oliver J. Sutton, Qinghua Zhou, Wei Wang, Desmond J. Higham, Alexander N. Gorban, Alexander Bastounis, Ivan Y. Tyukin

Abstract: We reveal new methods and the theoretical foundations of techniques for editing large language models. We also show how the new theory can be used to assess the editability of models and to expose their susceptibility to previously unknown malicious attacks. Our theoretical approach shows that a single metric (a specific measure of the intrinsic dimensionality of the model's features) is fundamental to predicting the success of popular editing approaches, and reveals new bridges between disparate families of editing methods. We collectively refer to these approaches as stealth editing methods, because they aim to directly and inexpensively update a model's weights to correct the model's responses to known hallucinating prompts without otherwise affecting the model's behaviour and without requiring retraining. By carefully applying the insight gleaned from our theoretical investigation, we are able to introduce a new network block -- named a jet-pack block -- which is optimised for highly selective model editing, uses only standard network operations, and can be inserted into existing networks. The intrinsic dimensionality metric also determines the vulnerability of a language model to a stealth attack: a small change to a model's weights which changes its response to a single attacker-chosen prompt. Stealth attacks do not require access to or knowledge of the model's training data, therefore representing a potent yet previously unrecognised threat to redistributed foundation models. They are computationally simple enough to be implemented in malware in many cases. Extensive experimental results illustrate and support the method and its theoretical underpinnings. Demos and source code for editing language models are available at https://github.com/qinghua-zhou/stealth-edits.

* 24 pages, 9 figures. Open source implementation: https://github.com/qinghua-zhou/stealth-edits

What is Hiding in Medicine's Dark Matter? Learning with Missing Data in Medical Practices

Feb 09, 2024

Neslihan Suzen, Evgeny M. Mirkes, Damian Roland, Jeremy Levesley, Alexander N. Gorban, Tim J. Coats

Abstract: Electronic patient records (EPRs) produce a wealth of data but contain significant missing information. Understanding and handling this missing data is an important part of clinical data analysis and, if left unaddressed, could result in bias in analysis and distortion in critical conclusions. Missing data may be linked to health care professional practice patterns, and imputation of missing data can increase the validity of clinical decisions. This study focuses on statistical approaches for understanding and interpreting the missing data and on machine learning based clinical data imputation, using a single centre's paediatric emergency data and data from the UK's largest clinical audit database for traumatic injury (TARN). In the study of 56,961 data points related to initial vital signs and observations taken on children presenting to an Emergency Department, we have shown that missing data are likely to be non-random and how these are linked to health care professional practice patterns. We have then examined 79 TARN fields with missing values for 5,791 trauma cases. Singular Value Decomposition (SVD) and k-Nearest Neighbour (kNN) based missing data imputation methods are used, and the imputation results are compared against the original dataset and statistically tested. We have concluded that the 1NN imputer is the best imputation method, which indicates a usual pattern of clinical decision making: find the most similar patients and take their attributes as imputation.

* 2023 IEEE International Conference on Big Data (BigData), 4979-4986
* 8 pages
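
The "find the most similar patient and take their attributes" idea can be sketched as a minimal 1-nearest-neighbour imputer. This is an illustrative sketch, not the study's implementation: the function name `impute_1nn`, the toy records, and the choice of Euclidean distance over the fields observed in both rows are assumptions, since the abstract does not specify the metric or any feature scaling.

```python
import math


def impute_1nn(rows):
    """Fill None entries in each row from its nearest neighbour.

    Assumption: distance is Euclidean over the fields observed in both rows,
    normalised by the number of shared fields. No feature scaling is applied,
    so fields with large numeric ranges dominate the distance.
    """
    def distance(a, b):
        shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not shared:
            return math.inf
        return math.sqrt(sum((x - y) ** 2 for x, y in shared) / len(shared))

    completed = []
    for i, row in enumerate(rows):
        filled = list(row)
        for j, value in enumerate(row):
            if value is not None:
                continue
            # Candidate donors: other rows that actually observed field j.
            donors = [r for k, r in enumerate(rows) if k != i and r[j] is not None]
            if donors:
                nearest = min(donors, key=lambda r: distance(row, r))
                filled[j] = nearest[j]
        completed.append(filled)
    return completed


if __name__ == "__main__":
    # Hypothetical toy records: (heart rate, respiratory rate, temperature).
    records = [
        [120.0, 30.0, 37.0],
        [80.0, 18.0, 36.5],
        [118.0, None, 37.2],
    ]
    print(impute_1nn(records))
```

In the toy data the third record is closest to the first, so its missing respiratory rate is copied from that record.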

Weakly Supervised Learners for Correction of AI Errors with Provable Performance Guarantees

Feb 06, 2024

Ivan Y. Tyukin, Tatiana Tyukina, Daniel van Helden, Zedong Zheng, Evgeny M. Mirkes, Oliver J. Sutton, Qinghua Zhou, Alexander N. Gorban, Penelope Allison

Abstract: We present a new methodology for handling AI errors by introducing weakly supervised AI error correctors with a priori performance guarantees. These AI correctors are auxiliary maps whose role is to moderate the decisions of some previously constructed underlying classifier by either approving or rejecting its decisions. The rejection of a decision can be used as a signal to suggest abstaining from making a decision. A key technical focus of the work is in providing performance guarantees for these new AI correctors through bounds on the probabilities of incorrect decisions. These bounds are distribution agnostic and do not rely on assumptions on the data dimension. Our empirical example illustrates how the framework can be applied to improve the performance of an image classifier in a challenging real-world task where training data are scarce.

Exploring the impact of social stress on the adaptive dynamics of COVID-19: Typing the behavior of naïve populations faced with epidemics

Nov 23, 2023

Innokentiy Kastalskiy, Andrei Zinovyev, Evgeny Mirkes, Victor Kazantsev, Alexander N. Gorban

Abstract: In the context of natural disasters, human responses inevitably intertwine with natural factors. The COVID-19 pandemic, as a significant stress factor, has brought to light profound variations among different countries in terms of their adaptive dynamics in addressing the spread of infection outbreaks across different regions. This emphasizes the crucial role of cultural characteristics in natural disaster analysis. The theoretical understanding of large-scale epidemics primarily relies on mean-field kinetic models. However, conventional SIR-like models failed to fully explain the observed phenomena at the onset of the COVID-19 outbreak. These phenomena included the unexpected cessation of exponential growth, the reaching of plateaus, and the occurrence of multi-wave dynamics. In situations where an outbreak of a highly virulent and unfamiliar infection arises, it becomes crucial to respond swiftly at a non-medical level to mitigate the negative socio-economic impact. Here we present a theoretical examination of the first wave of the epidemic based on a simple SIRSS model (SIR with Social Stress). We conduct an analysis of the socio-cultural features of naïve population behaviors across various countries worldwide. The unique characteristics of each country/territory are encapsulated in only a few constants within our model, derived from the fitted COVID-19 statistics. These constants also reflect the societal response dynamics to the external stress factor, underscoring the importance of studying the mutual behavior of humanity and natural factors during global social disasters. Based on these distinctive characteristics of specific regions, local authorities can optimize their strategies to effectively combat epidemics until vaccines are developed.

* 27 pages, 15 figures, 1 table, 1 appendix
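
For context, the conventional SIR baseline that the abstract contrasts with can be sketched as follows. This is the classical mean-field SIR model only, integrated with forward Euler; it is not the SIRSS model, whose equations and constants the abstract does not give. The function name `simulate_sir` and all parameter values are assumptions chosen for illustration.

```python
def simulate_sir(beta, gamma, i0=1e-4, days=200, dt=0.1):
    """Integrate the classical SIR model with forward Euler.

    dS/dt = -beta*S*I,  dI/dt = beta*S*I - gamma*I,  dR/dt = gamma*I,
    where S, I, R are population fractions that sum to 1.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    history = [(0.0, s, i, r)]
    steps = round(days / dt)
    for n in range(1, steps + 1):
        new_infections = beta * s * i * dt
        recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - recoveries
        r += recoveries
        history.append((n * dt, s, i, r))
    return history


if __name__ == "__main__":
    # Hypothetical rates: transmission 0.3/day, recovery 0.1/day.
    trajectory = simulate_sir(beta=0.3, gamma=0.1)
    peak_time, _, peak_infected, _ = max(trajectory, key=lambda row: row[2])
    print(f"peak infected fraction {peak_infected:.3f} at day {peak_time:.1f}")
```

A model of this form produces a single smooth wave, which is why, as the abstract notes, it cannot by itself account for plateaus or multi-wave dynamics.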

The Boundaries of Verifiable Accuracy, Robustness, and Generalisation in Deep Learning

Sep 13, 2023

Alexander Bastounis, Alexander N. Gorban, Anders C. Hansen, Desmond J. Higham, Danil Prokhorov, Oliver Sutton, Ivan Y. Tyukin, Qinghua Zhou

Abstract: In this work, we assess the theoretical limitations of determining guaranteed stability and accuracy of neural networks in classification tasks. We consider a classical distribution-agnostic framework and algorithms minimising empirical risks and potentially subject to some weight regularisation. We show that there is a large family of tasks for which computing and verifying ideal stable and accurate neural networks in the above settings is extremely challenging, if at all possible, even when such ideal solutions exist within the given class of neural architectures.

How adversarial attacks can disrupt seemingly stable accurate classifiers

Sep 07, 2023

Oliver J. Sutton, Qinghua Zhou, Ivan Y. Tyukin, Alexander N. Gorban, Alexander Bastounis, Desmond J. Higham

Abstract: Adversarial attacks dramatically change the output of an otherwise accurate learning system using a seemingly inconsequential modification to a piece of input data. Paradoxically, empirical evidence indicates that even systems which are robust to large random perturbations of the input data remain susceptible to small, easily constructed, adversarial perturbations of their inputs. Here, we show that this may be seen as a fundamental feature of classifiers working with high dimensional input data. We introduce a simple, generic and generalisable framework for which key behaviours observed in practical systems arise with high probability -- notably the simultaneous susceptibility of the (otherwise accurate) model to easily constructed adversarial attacks, and robustness to random perturbations of the input data. We confirm that the same phenomena are directly observed in practical neural networks trained on standard image classification problems, where even large additive random noise fails to trigger the adversarial instability of the network. A surprising takeaway is that even small margins separating a classifier's decision surface from training and testing data can hide adversarial susceptibility from being detected using randomly sampled perturbations. Counterintuitively, using additive noise during training or testing is therefore inefficient for eradicating or detecting adversarial examples, and more demanding adversarial training is required.

* 11 pages, 8 figures, additional supplementary materials
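
The contrast between random and adversarial perturbations can be seen in a toy high-dimensional example. The sketch below is not the paper's framework: it assumes a hypothetical linear classifier sign(x[0]), a sample at a small margin from the decision hyperplane, and made-up values for the dimension, margin, and noise norm.

```python
import math
import random


def flip_rates(dim=2000, margin=0.1, noise_norm=1.0, trials=200, seed=0):
    """Compare random and adversarial perturbations for a linear classifier.

    Hypothetical toy setup: the classifier is sign(x[0]), so the decision
    surface is the hyperplane x[0] = 0 and the sample sits at distance
    `margin` from it. Only the first coordinate affects the decision.
    """
    rng = random.Random(seed)
    x0 = margin

    random_flips = 0
    for _ in range(trials):
        # Random direction, rescaled to Euclidean norm `noise_norm`.
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        scale = noise_norm / math.sqrt(sum(v * v for v in g))
        if x0 + scale * g[0] < 0:
            random_flips += 1

    # Adversarial step: move straight towards the decision surface with a
    # norm only slightly larger than the margin.
    adversarial_norm = 1.1 * margin
    adversarial_flip = (x0 - adversarial_norm) < 0

    return random_flips / trials, adversarial_flip, adversarial_norm


if __name__ == "__main__":
    rate, flipped, adv_norm = flip_rates()
    print(f"random noise (norm 1.0) flip rate: {rate:.3f}")
    print(f"adversarial step (norm {adv_norm:.2f}) flips the label: {flipped}")
```

In high dimension a random direction has only a tiny component along the normal to the decision surface, so noise much larger than the margin rarely changes the label, while a far smaller step aimed along the normal always does.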

Towards a mathematical understanding of learning from few examples with nonlinear feature maps

Nov 07, 2022

Oliver J. Sutton, Alexander N. Gorban, Ivan Y. Tyukin

Abstract: We consider the problem of data classification where the training set consists of just a few data points. We explore this phenomenon mathematically and reveal key relationships between the geometry of an AI model's feature space, the structure of the underlying data distributions, and the model's generalisation capabilities. The main thrust of our analysis is to reveal the influence on the model's generalisation capabilities of nonlinear feature transformations mapping the original data into high, and possibly infinite, dimensional spaces.

* 18 pages, 8 figures

Domain Adaptation Principal Component Analysis: base linear method for learning with out-of-distribution data

Aug 28, 2022

Evgeny M. Mirkes, Jonathan Bac, Aziz Fouché, Sergey V. Stasenko, Andrei Zinovyev, Alexander N. Gorban

Abstract: Domain adaptation is a popular paradigm in modern machine learning which aims at tackling the problem of divergence between a labeled training or validation dataset used for learning and testing a classifier (source domain) and a potentially large unlabeled dataset where the model is exploited (target domain). The task is to find a common representation of both source and target datasets in which the source dataset is informative for training and the divergence between source and target is minimized. Most popular solutions for domain adaptation are currently based on training neural networks that combine classification and adversarial learning modules, which are data hungry and usually difficult to train. We present a method called Domain Adaptation Principal Component Analysis (DAPCA) which finds a linear reduced data representation useful for solving the domain adaptation task. DAPCA is based on introducing positive and negative weights between pairs of data points and generalizes the supervised extension of principal component analysis. DAPCA is an iterative algorithm in which a simple quadratic optimization problem is solved at each iteration. The convergence of the algorithm is guaranteed and the number of iterations is small in practice. We validate the suggested algorithm on previously proposed benchmarks for solving the domain adaptation task, and also show the benefit of using DAPCA in the analysis of single cell omics datasets in biomedical applications. Overall, DAPCA can serve as a useful preprocessing step in many machine learning applications leading to reduced dataset representations, taking into account possible divergence between source and target domains.
