Claudio Angione

Encrypted Large Model Inference: The Equivariant Encryption Paradigm

Feb 03, 2025

James Buban, Hongyang Zhang, Claudio Angione, Harry Yang, Ahmad Farhan, Seyfal Sultanov, Michael Du, Xuran Ma, Zihao Wang, Yue Zhao(+3 more)

Abstract:Large-scale deep learning models, such as modern language models and diffusion architectures, have revolutionized applications ranging from natural language processing to computer vision. However, their deployment in distributed or decentralized environments raises significant privacy concerns, as sensitive data may be exposed during inference. Traditional techniques like secure multi-party computation, homomorphic encryption, and differential privacy offer partial remedies but often incur substantial computational overhead, latency penalties, or limited compatibility with non-linear network operations. In this work, we introduce Equivariant Encryption (EE), a novel paradigm designed to enable secure, "blind" inference on encrypted data with near-zero performance overhead. Unlike fully homomorphic approaches that encrypt the entire computational graph, EE selectively obfuscates critical internal representations within neural network layers while preserving the exact functionality of both linear and a prescribed set of non-linear operations. This targeted encryption ensures that raw inputs, intermediate activations, and outputs remain confidential, even when processed on untrusted infrastructure. We detail the theoretical foundations of EE, compare its performance and integration complexity against conventional privacy-preserving techniques, and demonstrate its applicability across a range of architectures, from convolutional networks to large language models. Furthermore, our work provides a comprehensive threat analysis, outlining potential attack vectors and baseline strategies, and benchmarks EE against standard inference pipelines in decentralized settings. The results confirm that EE maintains high fidelity and throughput, effectively bridging the gap between robust data confidentiality and the stringent efficiency requirements of modern, large-scale model inference.
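
The abstract does not spell out the EE transformation. The sketch below assumes, purely for illustration, that the secret transformation is a permutation of a layer's units. It shows the equivariance property such a scheme would rely on: a linear layer followed by an element-wise non-linearity gives the same result on permuted data, up to a permutation the data owner can undo. The helper names (`encrypt_layer`, `permute`, `invert`) are hypothetical and not taken from the paper.

```python
import random

def relu(v):
    # Element-wise non-linearity; it commutes with any reordering of its inputs.
    return [max(0, x) for x in v]

def matvec(W, x):
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def layer(W, x):
    return relu(matvec(W, x))

def permute(v, perm):
    # Position i of the result takes the value stored at position perm[i].
    return [v[p] for p in perm]

def invert(perm):
    inv = [0] * len(perm)
    for i, p in enumerate(perm):
        inv[p] = i
    return inv

def encrypt_layer(W, perm_in, perm_out):
    # Reorder columns to match the permuted input and rows to permute the output.
    return [[W[r][c] for c in perm_in] for r in perm_out]

random.seed(0)
# Integer weights keep the comparison exact (no floating-point reordering effects).
W = [[random.randint(-3, 3) for _ in range(4)] for _ in range(3)]
x = [random.randint(-5, 5) for _ in range(4)]
perm_in = random.sample(range(4), 4)    # secret held by the data owner (assumption)
perm_out = random.sample(range(3), 3)   # secret held by the data owner (assumption)

plain = layer(W, x)
# The untrusted node only ever sees the permuted weights and the permuted input.
blind = layer(encrypt_layer(W, perm_in, perm_out), permute(x, perm_in))
recovered = permute(blind, invert(perm_out))
```

Here `recovered` equals `plain`, while the node that computed `blind` never handled the activations in their original order. A permutation on its own is not a claim about the security level of the published scheme.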

Meta-Learning for Speeding Up Large Model Inference in Decentralized Environments

Oct 28, 2024

Yuzhe Yang, Yipeng Du, Ahmad Farhan, Claudio Angione, Yue Zhao, Harry Yang, Fielding Johnston, James Buban, Patrick Colangelo

Abstract:The deployment of large-scale models, such as large language models (LLMs) and sophisticated image generation systems, incurs substantial costs due to their computational demands. To mitigate these costs and address challenges related to scalability and data security, there is a growing shift towards decentralized systems for deploying such models. In these decentralized environments, efficient inference acceleration becomes crucial to manage computational resources effectively and enhance system responsiveness. In this work, we address the challenge of selecting optimal acceleration methods in decentralized systems by introducing a meta-learning-based framework. This framework automates the selection process by learning from historical performance data of various acceleration techniques across different tasks. Unlike traditional methods that rely on random selection or expert intuition, our approach systematically identifies the best acceleration strategies based on the specific characteristics of each task. We demonstrate that our meta-learning framework not only streamlines the decision-making process but also consistently outperforms conventional methods in terms of efficiency and performance. Our results highlight the potential of meta-learning to revolutionize inference acceleration in decentralized AI systems, offering a path towards more democratic and economically feasible artificial intelligence solutions.
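
As a rough illustration of selecting an acceleration method from historical performance data, the sketch below uses a nearest-neighbour rule over task meta-features. The history records, the two-dimensional feature vectors, and the method names are invented for the example. The abstract does not state which meta-learner or which acceleration techniques the framework uses.

```python
import math

# Hypothetical history: (task meta-features, acceleration method, measured speedup).
HISTORY = [
    ((0.9, 0.1), "quantization", 2.1),
    ((0.9, 0.1), "pruning", 1.4),
    ((0.2, 0.8), "quantization", 1.2),
    ((0.2, 0.8), "pruning", 1.9),
]

def select_method(task, history, k=2):
    """Pick the method with the best mean speedup among the k most similar past records."""
    nearest = sorted(history, key=lambda rec: math.dist(rec[0], task))[:k]
    speedups = {}
    for _, method, speedup in nearest:
        speedups.setdefault(method, []).append(speedup)
    return max(speedups, key=lambda m: sum(speedups[m]) / len(speedups[m]))

choice_a = select_method((0.8, 0.2), HISTORY)
choice_b = select_method((0.3, 0.7), HISTORY)
```

With this toy history, a task resembling the first group is routed to `"quantization"` and one resembling the second group to `"pruning"`, in contrast to the random or intuition-based selection the abstract describes as the baseline.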

Model Agnostic Hybrid Sharding For Heterogeneous Distributed Inference

Jul 29, 2024

Claudio Angione, Yue Zhao, Harry Yang, Ahmad Farhan, Fielding Johnston, James Buban, Patrick Colangelo

Abstract:The rapid growth of large-scale AI models, particularly large language models, has brought significant challenges in data privacy, computational resources, and accessibility. Traditional centralized architectures often struggle to meet data security and scalability needs, which hinders the democratization of AI systems. Nesa introduces a model-agnostic sharding framework designed for decentralized AI inference. Our framework uses blockchain-based sequential deep neural network sharding to distribute computational tasks across a diverse network of nodes based on a personalised heuristic and routing mechanism. This enables efficient distributed training and inference for recent large-scale models even on consumer-grade hardware. We use compression techniques like dynamic blockwise quantization and mixed matrix decomposition to reduce data transfer and memory needs. We also integrate robust security measures, including hardware-based trusted execution environments, to ensure data integrity and confidentiality. Evaluating our system across various natural language processing and vision tasks shows that these compression strategies do not compromise model accuracy. Our results highlight the potential to democratize access to cutting-edge AI technologies by enabling secure and efficient inference on a decentralized network.
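
To make the idea of blockwise quantization concrete, here is a minimal sketch of one common variant, per-block absolute-maximum scaling to signed 8-bit codes. The block size, the number of levels, and the function names are assumptions for the example. The paper's "dynamic" scheme may differ in how the codes are chosen.

```python
def quantize_blockwise(values, block_size=4, levels=127):
    """Quantize each block separately using its own absolute-maximum scale."""
    blocks = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        scale = max(abs(v) for v in block) or 1.0  # avoid dividing by zero for all-zero blocks
        codes = [round(v / scale * levels) for v in block]
        blocks.append((scale, codes))
    return blocks

def dequantize_blockwise(blocks, levels=127):
    return [code * scale / levels for scale, codes in blocks for code in codes]

# Two blocks with very different magnitudes: each gets its own scale.
values = [0.1, -0.5, 0.25, 0.0, 100.0, -50.0, 25.0, 1.0]
blocks = quantize_blockwise(values)
restored = dequantize_blockwise(blocks)
```

Because each block carries its own scale, the small values in the first block are not crushed by the large values in the second. The rounding error for any element is at most half a quantization step of its own block.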

Complete Security and Privacy for AI Inference in Decentralized Systems

Jul 28, 2024

Hongyang Zhang, Yue Zhao, Claudio Angione, Harry Yang, James Buban, Ahmad Farhan, Fielding Johnston, Patrick Colangelo

Abstract:The need for data security and model integrity has been accentuated by the rapid adoption of AI and ML in data-driven domains including healthcare, finance, and security. Large models are crucial for tasks like diagnosing diseases and forecasting finances but tend to be delicate and not very scalable. Decentralized systems solve this issue by distributing the workload and reducing central points of failure. Yet, data and processes spread across different nodes can be at risk of unauthorized access, especially when they involve sensitive information. Nesa solves these challenges with a comprehensive framework using multiple techniques to protect data and model outputs. This includes zero-knowledge proofs for secure model verification. The framework also introduces consensus-based verification checks for consistent outputs across nodes and confirms model integrity. Split Learning divides models into segments processed by different nodes for data privacy by preventing full data access at any single point. For hardware-based security, trusted execution environments are used to protect data and computations within secure zones. Nesa's state-of-the-art proofs and principles demonstrate the framework's effectiveness, making it a promising approach for securely democratizing artificial intelligence.

* 25 pages, 5 figures
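
One simple way to picture a consensus-based check for consistent outputs across nodes is a majority vote over output digests. The sketch below is an assumption about how such a check could look, not the framework's actual protocol: the node names, the strict-majority rule, and the `consensus` helper are hypothetical.

```python
import hashlib
from collections import Counter

def digest(output: str) -> str:
    return hashlib.sha256(output.encode("utf-8")).hexdigest()

def consensus(outputs):
    """Return (agreed_output, dissenting_nodes), or (None, []) without a strict majority."""
    counts = Counter(digest(o) for o in outputs.values())
    winner, votes = counts.most_common(1)[0]
    if votes * 2 <= len(outputs):
        return None, []
    agreed = next(o for o in outputs.values() if digest(o) == winner)
    dissenters = sorted(node for node, o in outputs.items() if digest(o) != winner)
    return agreed, dissenters

result = consensus({"node-a": "cat", "node-b": "cat", "node-c": "dog"})
split = consensus({"node-a": "cat", "node-b": "dog"})
```

Comparing digests rather than raw outputs means nodes only need to exchange short hashes to detect a disagreement. In this example `node-c` is flagged as the dissenter, and the two-node split yields no agreed output.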

A pipeline and comparative study of 12 machine learning models for text classification

Apr 04, 2022

Annalisa Occhipinti, Louis Rogers, Claudio Angione

Abstract:Text-based communication is highly favoured as a communication method, especially in business environments. As a result, it is often abused by sending malicious messages, e.g., spam emails, to deceive users into relaying personal information, including online account credentials or banking details. For this reason, many machine learning methods for text classification have been proposed and incorporated into the services of most email providers. However, optimising text classification algorithms and finding the right tradeoff on their aggressiveness is still a major research problem. We present an updated survey of 12 machine learning text classifiers applied to a public spam corpus. A new pipeline is proposed to optimise hyperparameter selection and improve the models' performance by applying specific methods (based on natural language processing) in the preprocessing stage. Our study aims to provide a new methodology to investigate and optimise the effect of different feature sizes and hyperparameters in machine learning classifiers that are widely used in text classification problems. The classifiers are tested and evaluated on different metrics including F-score (accuracy), precision, recall, and run time. By analysing all these aspects, we show how the proposed pipeline can be used to achieve good accuracy in spam filtering on the Enron dataset, a widely used public email corpus. Statistical tests and explainability techniques are applied to provide a robust analysis of the proposed pipeline and interpret the classification outcomes of the 12 machine learning models, also identifying words that drive the classification results. Our analysis shows that it is possible to identify an effective machine learning model to classify the Enron dataset with an F-score of 94%.

* This article has been accepted for publication in Expert Systems with Applications, April 2022. Published by Elsevier. All data, models, and code used in this work are available on GitHub at https://github.com/Angione-Lab/12-machine-learning-models-for-text-classification
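
For readers unfamiliar with the metrics named in the abstract, the sketch below computes precision, recall, and the F-score (here the balanced F1 variant, which is an assumption about which F-score is meant) for a binary spam/ham task. The labels are invented for the example and are not drawn from the Enron corpus.

```python
def precision_recall_f1(y_true, y_pred, positive="spam"):
    """Compute precision, recall and F1 for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Made-up labels: two spam caught, one spam missed, one ham wrongly flagged.
y_true = ["spam", "spam", "ham", "ham", "spam"]
y_pred = ["spam", "ham", "ham", "spam", "spam"]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

In this toy case precision, recall, and F1 all come out to 2/3. Raising a classifier's aggressiveness typically trades precision for recall, which is the tradeoff the abstract refers to.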

Using Machine Learning to Emulate Agent-Based Simulations

May 05, 2020

Claudio Angione, Eric Silverman, Elisabeth Yaneske

Abstract:In this paper, we evaluate the performance of multiple machine-learning methods in the emulation of agent-based models (ABMs). ABMs are a popular methodology for modelling complex systems composed of multiple interacting processes. The analysis of ABM outputs is often not straightforward, as the relationships between input parameters can be non-linear or even chaotic, and each individual model run can require significant CPU time. Statistical emulation, in which a statistical model of the ABM is constructed to allow for more in-depth model analysis, has proven valuable for some applications. Here we compare multiple machine-learning methods for ABM emulation in order to determine the approaches best-suited to replicating the complex and non-linear behaviour of ABMs. Our results suggest that, in most scenarios, artificial neural networks (ANNs) and support vector machines outperform Gaussian process emulators, currently the most commonly used method for the emulation of complex computational models. ANNs produced the most accurate model replications in scenarios with high numbers of model runs, although training times for these emulators were considerably longer than for any other method. We propose that users of complex ABMs would benefit from using machine-learning methods for emulation, as this can facilitate more robust sensitivity analyses for their models as well as reducing CPU time consumption when calibrating and analysing the simulation.

* 13 pages, 5 figures
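
The emulation workflow described in the abstract (run the costly model at a set of design points, fit a cheap surrogate, then query the surrogate) can be sketched as follows. The toy simulator is a stand-in for an agent-based model, and the surrogate is a simple inverse-distance-weighted nearest-neighbour rule chosen to keep the example dependency-free. It is not one of the neural network, support vector machine, or Gaussian process emulators compared in the paper.

```python
import math

def toy_simulation(x):
    # Hypothetical stand-in for one expensive agent-based model run.
    return math.sin(3 * x) + 0.5 * x

def fit_emulator(design_points, simulator):
    """Run the simulator once per design point and return a cheap surrogate."""
    runs = [(x, simulator(x)) for x in design_points]

    def emulate(x, k=2):
        # Inverse-distance weighting over the k nearest stored runs.
        nearest = sorted(runs, key=lambda run: abs(run[0] - x))[:k]
        weights = [1.0 / (abs(xi - x) + 1e-12) for xi, _ in nearest]
        return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)

    return emulate

design = [i / 20 for i in range(21)]       # 21 simulator runs on [0, 1]
emulate = fit_emulator(design, toy_simulation)
error_between = abs(emulate(0.525) - toy_simulation(0.525))
error_on_grid = abs(emulate(0.5) - toy_simulation(0.5))
```

After the 21 initial runs, every further query is answered from stored results without calling the simulator again, which is the source of the CPU-time savings in sensitivity analysis and calibration.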
