Abstract:Threat modeling is a crucial component of cybersecurity, particularly for industries such as banking, where the security of financial data is paramount. Traditional threat modeling approaches require expert intervention and manual effort, often leading to inefficiencies and human error. The advent of Large Language Models (LLMs) offers a promising avenue for automating these processes, enhancing both efficiency and efficacy. However, this transition is not straightforward due to three main challenges: (1) the lack of publicly available, domain-specific datasets, (2) the need for tailored models to handle complex banking system architectures, and (3) the requirement for real-time, adaptive mitigation strategies that align with compliance standards like NIST 800-53. In this paper, we introduce ThreatModeling-LLM, a novel and adaptable framework that automates threat modeling for banking systems using LLMs. ThreatModeling-LLM operates in three stages: (1) dataset creation, (2) prompt engineering, and (3) model fine-tuning. We first generate a benchmark dataset using the Microsoft Threat Modeling Tool (TMT). Then, we apply Chain-of-Thought (CoT) prompting and Optimization by PROmpting (OPRO) to the pre-trained LLMs to optimize the initial prompt. Lastly, we fine-tune the LLM using Low-Rank Adaptation (LoRA) on the benchmark dataset with the optimized prompt to improve the threat identification and mitigation generation capabilities of the pre-trained LLMs.
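To make the fine-tuning stage concrete, here is a minimal LoRA sketch using the Hugging Face PEFT library; the base model name, adapter rank, and target modules below are illustrative assumptions, not the configuration reported in the paper.

```python
# Minimal LoRA fine-tuning sketch (illustrative assumptions throughout:
# the base model, rank, and target modules are NOT the paper's settings).
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "meta-llama/Llama-2-7b-hf"  # hypothetical base model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA freezes the pre-trained weights and trains only low-rank
# adapter matrices injected into the attention projections.
lora = LoraConfig(
    r=8,                                  # adapter rank (assumed)
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # adapters are a tiny fraction of the weights
```

The frozen-base design is what makes the paper's pipeline practical: only the small adapter matrices are updated on the benchmark dataset, so fine-tuning fits on modest hardware while the pre-trained knowledge is preserved.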
Abstract:This paper reports on Team Northeastern's Avatar system for telepresence and our holistic approach to meeting the ANA Avatar XPRIZE Final testing task requirements. The system features a dual-arm configuration with a hydraulically actuated glove-gripper pair for haptic force feedback. Our proposed Avatar system was evaluated in the ANA Avatar XPRIZE Finals, where it completed all 10 tasks, scored 14.5 points out of 15.0, and received the 3rd Place Award. We provide the details of improvements over our first-generation Avatar, covering manipulation, perception, locomotion, power, network, and controller design. We also extensively discuss the major lessons learned during our participation in the competition.
Abstract:Deception is rapidly growing as an important tool for cyber defence, complementing existing perimeter security measures to rapidly detect breaches and data theft. One of the factors limiting the use of deception has been the cost of generating realistic artefacts by hand. Recent advances in Machine Learning have, however, created opportunities for scalable, automated generation of realistic deceptions. This vision paper describes the opportunities and challenges involved in developing models to mimic many common elements of the IT stack for deception effects.
Abstract:The impressive growth of data throughput in optical microscopy has triggered widespread use of supervised learning (SL) models running on compressed image datasets for efficient automated analysis. However, since lossy image compression risks producing unpredictable artifacts, quantifying the effect of data compression on SL applications is of pivotal importance for assessing their reliability, especially in clinical use. We propose an experimental method to evaluate the tolerability of image compression distortions in 2D and 3D cell segmentation SL tasks: predictions on compressed data are compared to the raw predictive uncertainty, which is numerically estimated from the raw noise statistics measured through sensor calibration. We show that predictions of object- and image-specific segmentation parameters can be altered by up to 15% and by more than 10 standard deviations after 16-to-8 bit downsampling or JPEG compression. In contrast, a recently developed lossless compression algorithm yields a prediction spread that is statistically equivalent to that stemming from raw noise, while providing a compression ratio of up to 10:1. By setting a lower bound on the SL predictive uncertainty, our technique can be generalized to validate a variety of data analysis pipelines in SL-assisted fields.
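A minimal sketch of the comparison the abstract describes: the raw predictive uncertainty of a segmentation statistic is estimated by resampling calibrated sensor noise, and a compressed image's prediction is then scored in units of that spread. The `segment` callable and the Poisson-Gaussian noise parameters are hypothetical placeholders, not the paper's actual calibration values.

```python
import numpy as np

def predictive_spread(raw, segment, gain=2.0, read_noise=1.5, n=20, seed=0):
    """Estimate the raw predictive uncertainty of a segmentation statistic
    by resampling a calibrated Poisson-Gaussian sensor noise model
    (gain and read_noise would come from sensor calibration)."""
    rng = np.random.default_rng(seed)
    vals = []
    for _ in range(n):
        noisy = (rng.poisson(np.clip(raw, 0, None) / gain) * gain
                 + rng.normal(0.0, read_noise, size=raw.shape))
        vals.append(segment(noisy))  # e.g. cell count or mean object area
    return float(np.mean(vals)), float(np.std(vals))

# Usage idea: a compressed image's prediction is tolerable if it stays
# within k standard deviations of the raw-noise-induced spread:
#   mu, sigma = predictive_spread(raw_image, segment)
#   z = abs(segment(decompressed_image) - mu) / sigma   # flag if z > k
```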
Abstract:Monoclonal antibodies (mAbs) constitute one of the most important strategies for treating patients suffering from cancers such as hematological malignancies and solid tumors. To guarantee the quality of these preparations compounded at the hospital, quality control has to be developed. The aim of this study was to explore a noninvasive, nondestructive, and rapid analytical method to ensure the quality of the final preparation without causing any delay in the process. We analyzed four mAbs (Infliximab, Bevacizumab, Ramucirumab, and Rituximab) diluted at therapeutic concentrations in 0.9% sodium chloride using Raman spectroscopy. To reduce the prediction errors obtained with traditional chemometric data analysis, we explored a data-driven approach using statistical machine learning methods in which preprocessing and predictive models are jointly optimized. We prepared a data analytics workflow and submitted the problem to a collaborative data challenge platform called Rapid Analytics and Model Prototyping (RAMP). This allowed us to use solutions from about 300 data scientists during five days of collaborative work. The prediction of the four mAb samples was considerably improved, with a misclassification rate of 0.8% and a mean error rate of 4%.
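A sketch of the joint optimization idea using scikit-learn: preprocessing and classifier hyperparameters are searched together over the spectra rather than fixed a priori. The pipeline steps and parameter grids are illustrative assumptions, not the challenge's winning configuration.

```python
# Joint optimization of spectral preprocessing and the predictive model
# (all choices below are illustrative, not the RAMP challenge solutions).
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([
    ("scale", StandardScaler()),              # per-wavenumber standardization
    ("pca", PCA()),                           # spectral dimensionality reduction
    ("clf", LogisticRegression(max_iter=5000)),
])

# Preprocessing and model hyperparameters are tuned jointly by
# cross-validation, so the preprocessing is selected for predictive
# performance instead of being chosen beforehand.
grid = GridSearchCV(
    pipe,
    {"pca__n_components": [10, 20, 40], "clf__C": [0.1, 1.0, 10.0]},
    cv=5,
)
# grid.fit(X_spectra, y_mab)   # X_spectra: spectra matrix, y_mab: mAb identity
```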