Daniel Schwartz

Graph of Attacks with Pruning: Optimizing Stealthy Jailbreak Prompt Generation for Enhanced LLM Content Moderation

Jan 28, 2025

Daniel Schwartz, Dmitriy Bespalov, Zhe Wang, Ninad Kulkarni, Yanjun Qi

Abstract: We present a modular pipeline that automates the generation of stealthy jailbreak prompts derived from high-level content policies, enhancing LLM content moderation. First, we address query inefficiency and jailbreak strength by developing Graph of Attacks with Pruning (GAP), a method that utilizes strategies from prior jailbreaks, achieving a 92% attack success rate on GPT-3.5 while using only 54% of the queries required by the prior algorithm. Second, we address the cold-start issue by automatically generating seed prompts from the high-level policy using LLMs. Finally, we demonstrate the utility of these generated jailbreak prompts in improving content moderation by fine-tuning PromptGuard, a model trained to detect jailbreaks, increasing its accuracy on the Toxic-Chat dataset from 5.1% to 93.89%.

* 15 pages, 7 figures

EvoSTS Forecasting: Evolutionary Sparse Time-Series Forecasting

Apr 14, 2022

Ethan Jacob Moyer, Alisha Isabelle Augustin, Satvik Tripathi, Ansh Aashish Dholakia, Andy Nguyen, Isamu Mclean Isozaki, Daniel Schwartz, Edward Kim

Abstract: In this work, we present EvoSTS, our novel evolutionary sparse time-series forecasting algorithm. The algorithm attempts to evolutionarily prioritize the weights of a Long Short-Term Memory (LSTM) network that best minimize the reconstruction loss of a predicted signal using a learned sparse coded dictionary. In each generation of our evolutionary algorithm, a set number of children with the same initial weights are spawned. Each child undergoes a training step and adjusts its weights on the same data. Due to stochastic back-propagation, the set of children has a variety of weights with different levels of performance. The weights that best minimize the reconstruction loss with a given signal dictionary are passed to the next generation. The predictions from the best-performing weights of the first and last generations are compared. Comparing these two generations, we found improvements. However, due to several confounding parameters and hyperparameter limitations, some of the weights showed negligible improvement. To the best of our knowledge, this is the first attempt to use sparse coding in this way to optimize the weights of a time-series forecasting model, such as an LSTM network.

* 5 pages, 2 figures, 2 tables
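
The generational loop described in the abstract above (spawn children with identical weights, give each one stochastic training step, pass on the best) can be sketched roughly as follows. This is a toy illustration under loud assumptions, not the authors' implementation: a small linear model stands in for the LSTM, plain mean squared error stands in for the sparse-dictionary reconstruction loss, and every name here (`loss`, `train_step`, `evolve`) is hypothetical.

```python
import random

def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def loss(w, data):
    # Assumption: plain MSE replaces the paper's sparse-coding reconstruction loss.
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

def train_step(w, data, rng, lr=0.05, batch=16):
    # One stochastic gradient step on a random minibatch; the random
    # minibatch is what makes identically initialized children diverge.
    sample = rng.sample(data, batch)
    grad = [0.0] * len(w)
    for x, y in sample:
        err = predict(w, x) - y
        for j, xj in enumerate(x):
            grad[j] += 2.0 * err * xj / batch
    return [wi - lr * g for wi, g in zip(w, grad)]

def evolve(data, dim, generations=20, children=8, seed=0):
    rng = random.Random(seed)
    w = [0.0] * dim                      # weights shared by all children
    history = [loss(w, data)]
    for _ in range(generations):
        spawned = [train_step(list(w), data, rng) for _ in range(children)]
        w = min(spawned, key=lambda c: loss(c, data))   # best child survives
        history.append(loss(w, data))
    return w, history

# Synthetic data for the toy model (hypothetical, not from the paper).
data_rng = random.Random(1)
true_w = [1.0, -2.0, 0.5]
data = []
for _ in range(200):
    x = [data_rng.gauss(0, 1) for _ in true_w]
    data.append((x, predict(true_w, x) + 0.01 * data_rng.gauss(0, 1)))

w_best, history = evolve(data, dim=len(true_w))
print(f"loss after first generation: {history[1]:.4f}, after last: {history[-1]:.4f}")
```

Comparing `history[1]` with `history[-1]` mirrors the abstract's comparison of the best weights from the first and last generations.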

Towards Searching Efficient and Accurate Neural Network Architectures in Binary Classification Problems

Jan 16, 2021

Yigit Alparslan, Ethan Jacob Moyer, Isamu Mclean Isozaki, Daniel Schwartz, Adam Dunlop, Shesh Dave, Edward Kim

Abstract: In recent years, deep neural networks have had great success in machine learning and pattern recognition. The architecture size of a neural network contributes significantly to its success. In this study, we optimize the selection process by investigating different search algorithms to find a neural network architecture size that yields the highest accuracy. We apply binary search on a very well-defined binary classification network search space and compare the results to those of linear search. We also propose how to relax some of the assumptions regarding the dataset so that our solution can be generalized to any binary classification problem. We report a 100-fold running time improvement over the naive linear search when we apply the binary search method to our datasets in order to find the best architecture candidate. By finding the optimal architecture size for any binary classification problem quickly, we hope that our research contributes to discovering intelligent algorithms for optimizing architecture size selection in machine learning.

* 8 pages, 11 figures
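
To illustrate why binary search can need far fewer model evaluations than linear search, here is a minimal sketch. The abstract does not state the exact search criterion, so this assumes one: accuracy is non-decreasing in hidden-layer size, and the goal is the smallest size that reaches a target accuracy. `toy_accuracy` is a hypothetical stand-in for training and validating a real network, and all names and numbers are illustrative only.

```python
def toy_accuracy(size):
    # Hypothetical stand-in for "train a network with `size` hidden units
    # and measure validation accuracy". Monotone increasing by construction.
    return size / (size + 10.0)

def linear_search(evaluate, lo, hi, target):
    """Try every size in order; return (first size reaching target, evaluations)."""
    calls = 0
    for size in range(lo, hi + 1):
        calls += 1
        if evaluate(size) >= target:
            return size, calls
    return None, calls

def binary_search(evaluate, lo, hi, target):
    """Halve the range each step; valid only if evaluate is non-decreasing."""
    calls = 0
    best = None
    while lo <= hi:
        mid = (lo + hi) // 2
        calls += 1
        if evaluate(mid) >= target:
            best = mid          # mid works; look for something smaller
            hi = mid - 1
        else:
            lo = mid + 1        # mid is too small
    return best, calls

size_l, calls_l = linear_search(toy_accuracy, 1, 1000, target=0.905)
size_b, calls_b = binary_search(toy_accuracy, 1, 1000, target=0.905)
print(f"linear: size={size_l} in {calls_l} evaluations")
print(f"binary: size={size_b} in {calls_b} evaluations")
```

Because each evaluation corresponds to a full training run in practice, the number of evaluations dominates running time. The speedup in this toy setting depends entirely on the assumed search range and is not meant to reproduce the 100-fold figure reported above.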
