Abstract: The advent of Large Language Models (LLMs) has garnered significant popularity and wields immense power across various domains within Natural Language Processing (NLP). While their capabilities are undeniably impressive, it is crucial to identify and scrutinize their vulnerabilities, especially when those vulnerabilities can have costly consequences. For instance, an LLM trained to provide concise summaries of medical documents could inadvertently leak personal patient data when prompted surreptitiously. This is just one of many unfortunate examples that have been unveiled, and further research is necessary to understand the underlying reasons behind such vulnerabilities. In this study, we delve into three categories of vulnerabilities: model-based, training-time, and inference-time vulnerabilities. We then discuss mitigation strategies, including "Model Editing," which aims to modify an LLM's behavior, and "Chroma Teaming," which combines multiple teaming strategies synergistically to enhance LLMs' resilience. This paper synthesizes the findings from each vulnerability section and proposes new directions for research and development. By understanding the focal points of current vulnerabilities, we can better anticipate and mitigate future risks, paving the way toward more robust and secure LLMs.
Abstract: Large language models (LLMs) have significantly transformed the landscape of Natural Language Processing (NLP). Their impact extends across a diverse spectrum of tasks, revolutionizing how we approach language understanding and generation. Nevertheless, alongside their remarkable utility, LLMs introduce critical security and risk considerations. These challenges warrant careful examination to ensure responsible deployment and to safeguard against potential vulnerabilities. This research paper thoroughly investigates LLMs from five thematic perspectives: security and privacy concerns, vulnerabilities to adversarial attacks, potential harms caused by misuse of LLMs, mitigation strategies to address these challenges, and the limitations of current strategies. Lastly, the paper recommends promising avenues for future research to enhance the security and risk management of LLMs.
Abstract: Large Language Models (LLMs) have revolutionized the field of Natural Language Generation (NLG) by demonstrating an impressive ability to generate human-like text. However, their widespread usage introduces challenges that necessitate thoughtful examination, ethical scrutiny, and responsible practices. In this study, we delve into these challenges and explore existing strategies for mitigating them, with a particular emphasis on identifying AI-generated text as the ultimate solution. Additionally, we assess the feasibility of detection from a theoretical perspective and propose novel research directions to address the current limitations in this domain.
Abstract: Recurrent Neural Networks (RNNs) are important tools for processing sequential data such as time series or video. Interpretability is defined as the ability to be understood by a person; it is distinct from explainability, which is the ability to be explained in a mathematical formulation. A key interpretability issue with RNNs is that it is not clear how much each hidden state at each time step contributes to the decision-making process in a quantitative manner. We propose NeuroView-RNN, a family of new RNN architectures that explain how all the time steps are used in the decision-making process. Each member of the family is derived from a standard RNN architecture by concatenating the hidden states of all time steps and feeding them into a global linear classifier. Because the global linear classifier takes all the hidden states as input, its weights map linearly onto the hidden states. Hence, from the weights, NeuroView-RNN can quantify how important each time step is to a particular decision. As a bonus, NeuroView-RNN also offers higher accuracy in many cases compared to standard RNNs and their variants. We showcase the benefits of NeuroView-RNN by evaluating it on a multitude of diverse time-series datasets.
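A minimal sketch of the idea described above, assuming a PyTorch-style implementation; the class and method names are illustrative and not the authors' code. All per-time-step hidden states are concatenated and passed to a single linear classifier, so each classifier weight can be folded back to a specific (time step, hidden unit) pair.

```python
import torch
import torch.nn as nn

class NeuroViewRNNSketch(nn.Module):
    def __init__(self, input_dim, hidden_dim, seq_len, num_classes):
        super().__init__()
        self.seq_len, self.hidden_dim = seq_len, hidden_dim
        self.rnn = nn.RNN(input_dim, hidden_dim, batch_first=True)
        # One classifier weight per (time step, hidden unit) pair.
        self.classifier = nn.Linear(seq_len * hidden_dim, num_classes)

    def forward(self, x):                        # x: (batch, seq_len, input_dim)
        hidden_states, _ = self.rnn(x)           # (batch, seq_len, hidden_dim)
        return self.classifier(hidden_states.flatten(1))

    def time_step_importance(self, class_idx):
        # Fold the classifier weights back to (seq_len, hidden_dim) and
        # aggregate per time step to gauge its contribution to `class_idx`.
        w = self.classifier.weight[class_idx].view(self.seq_len, self.hidden_dim)
        return w.abs().sum(dim=1)

model = NeuroViewRNNSketch(input_dim=8, hidden_dim=32, seq_len=50, num_classes=3)
logits = model(torch.randn(4, 50, 8))
print(model.time_step_importance(class_idx=0))   # one importance score per time step
```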
Abstract: Deep neural networks (DNs) provide superhuman performance in numerous computer vision tasks, yet it remains unclear exactly which of a DN's units contribute to a particular decision. NeuroView is a new family of DN architectures that are interpretable/explainable by design. Each member of the family is derived from a standard DN architecture by vector quantizing the unit output values and feeding them into a global linear classifier. The resulting architecture establishes a direct, causal link between the state of each unit and the classification decision. We validate NeuroView on standard datasets and classification tasks to show how its unit/class mapping aids in understanding the decision-making process.
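The sketch below illustrates the flavor of this design under stated assumptions: the architecture, the scalar quantizer, and all names are stand-ins for the vector quantization the abstract describes, not the authors' implementation. The point is only that every weight of the single global linear classifier ties one unit's quantized state to one class.

```python
import torch
import torch.nn as nn

class NeuroViewSketch(nn.Module):
    """Quantize each convolutional unit's pooled output and feed all unit
    states into one global linear classifier, so every classifier weight
    links a specific unit's state to the classification decision."""

    def __init__(self, num_classes=10):
        super().__init__()
        self.block1 = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU())
        self.block2 = nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU())
        self.pool = nn.AdaptiveAvgPool2d(1)            # one value per unit
        self.classifier = nn.Linear(16 + 32, num_classes)

    @staticmethod
    def quantize(h):
        # Crude stand-in for vector quantization: snap each unit's pooled
        # activation onto a small grid of levels (0.0, 0.5, ..., 3.0).
        # Training the conv layers end to end would need a straight-through
        # estimator, which is omitted in this sketch.
        return torch.round(torch.clamp(h, 0.0, 3.0) * 2) / 2

    def forward(self, x):                              # x: (batch, 3, H, W)
        f1 = self.block1(x)
        f2 = self.block2(f1)
        h1 = self.pool(f1).flatten(1)                  # (batch, 16)
        h2 = self.pool(f2).flatten(1)                  # (batch, 32)
        units = torch.cat([self.quantize(h1), self.quantize(h2)], dim=1)
        # classifier.weight[c, u] directly links unit u's state to class c.
        return self.classifier(units)
```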
Abstract: Deep neural networks have become essential for numerous applications, such as vision, reinforcement learning (RL), and classification, due to their strong empirical performance. Unfortunately, these networks are quite difficult to interpret, which limits their applicability in settings where interpretability is important for safety, such as medical imaging. The neural tangent kernel offers a view of a deep neural network as a kernel machine, which provides some degree of interpretability. To further improve interpretability with respect to classification and the individual layers, we develop a new model as a combination of multiple neural tangent kernels, one modeling each layer of the deep neural network, as opposed to past work that represents the entire network with a single neural tangent kernel. We demonstrate the interpretability of this model on two datasets, showing that the multiple-kernel model elucidates the interplay between the layers and the predictions.
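One way to make the "one kernel per layer" idea concrete is the decomposition of the empirical NTK, K(x, x') = sum over parameters of the inner product of output gradients, grouped by layer. The sketch below computes these per-layer contributions for a toy network; it is a simplified illustration under that assumption, not the paper's construction, and all names are hypothetical.

```python
import torch
import torch.nn as nn

# A toy scalar-output network; any architecture with named parameters works.
net = nn.Sequential(nn.Linear(8, 32), nn.Tanh(), nn.Linear(32, 1))

def per_layer_ntk(model, x1, x2):
    """Empirical NTK contribution of each parameter tensor:
    K(x1, x2) = sum_p <d f(x1)/dp, d f(x2)/dp>, reported per layer tensor."""
    def grads(x):
        model.zero_grad()
        model(x).sum().backward()
        return {name: p.grad.detach().clone()
                for name, p in model.named_parameters()}
    g1, g2 = grads(x1), grads(x2)
    return {name: torch.sum(g1[name] * g2[name]).item() for name in g1}

x1, x2 = torch.randn(1, 8), torch.randn(1, 8)
# Keys like '0.weight' and '2.weight' separate the first and last layers,
# hinting at how much each layer contributes to the kernel value.
print(per_layer_ntk(net, x1, x2))
```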