Abstract:Tools for rewriting, refactoring and optimizing code should be fast and correct. Large language models (LLMs), by their nature, possess neither of these qualities. Yet, there remains tremendous opportunity in using LLMs to improve code. We explore the use of LLMs not to transform code, but to code transforms. We propose a chain-of-thought approach to synthesizing code transformations from a small number of input/output code examples that incorporates execution and feedback. Unlike the direct rewrite approach, LLM-generated transformations are easy to inspect, debug, and validate. The logic of the rewrite is explicitly coded and easy to adapt. The compute required to run code transformations is minute compared to that of LLM rewriting. We test our approach on 16 Python code transformations and find that LLM-generated transforms are perfectly precise for 7 of them and less imprecise than direct LLM rewriting on the others. We hope to encourage further research into improving the precision of LLM code rewriting.
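To make the synthesize-execute-feedback idea concrete, below is a minimal sketch of such a loop. It assumes a hypothetical `complete` callable that wraps an LLM (prompt in, candidate code out) and stands in for whatever chain-of-thought prompting the paper actually uses; it is an illustration of the loop structure, not the authors' implementation.

```python
# Sketch: synthesize a `transform(src) -> str` function from I/O examples,
# executing each candidate and feeding failures back into the prompt.
from typing import Callable, Sequence, Tuple


def synthesize_transform(
    complete: Callable[[str], str],              # hypothetical LLM wrapper
    examples: Sequence[Tuple[str, str]],         # (input code, expected output code)
    max_rounds: int = 5,
) -> str:
    prompt = "Write a Python function `transform(src: str) -> str` that rewrites code as follows:\n"
    for src, dst in examples:
        prompt += f"\n# Input:\n{src}\n# Expected output:\n{dst}\n"

    for _ in range(max_rounds):
        candidate = complete(prompt)
        try:
            namespace: dict = {}
            exec(candidate, namespace)           # load the generated transform
            transform = namespace["transform"]
            failures = []
            for src, dst in examples:
                got = transform(src)
                if got != dst:
                    failures.append((src, dst, got))
        except Exception as err:                 # candidate failed to run at all
            prompt += f"\n# Your previous attempt raised: {err!r}. Try again.\n"
            continue
        if not failures:
            return candidate                     # passes every example
        src, dst, got = failures[0]              # feed one failing example back
        prompt += (
            f"\n# Your previous attempt produced:\n{got}\n"
            f"# for this input:\n{src}\n# but the expected output was:\n{dst}\n"
        )
    raise RuntimeError("no transform found within the round budget")
```

The key property highlighted by the abstract holds for any loop of this shape: the returned artifact is ordinary code that can be inspected, unit-tested, and rerun cheaply, independent of the LLM that produced it.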
Abstract:As large language models (LLMs) become increasingly prevalent across many real-world applications, understanding and enhancing their robustness to user inputs is of paramount importance. Existing methods for identifying adversarial prompts tend to focus on specific domains, lack diversity, or require extensive human annotations. To address these limitations, we present Rainbow Teaming, a novel approach for producing a diverse collection of adversarial prompts. Rainbow Teaming casts adversarial prompt generation as a quality-diversity problem, and uses open-ended search to generate prompts that are both effective and diverse. It can uncover a model's vulnerabilities across a broad range of domains including, in this paper, safety, question answering, and cybersecurity. We also demonstrate that fine-tuning on synthetic data generated by Rainbow Teaming improves the safety of state-of-the-art LLMs without hurting their general capabilities and helpfulness, paving the path to open-ended self-improvement.
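The quality-diversity framing can be pictured as a MAP-Elites-style archive over prompt features. The sketch below is a heavily simplified illustration, not Rainbow Teaming itself: `mutate`, `judge`, and `describe` are hypothetical callables standing in for the LLM mutator, the attack-effectiveness scorer, and the feature descriptor (e.g. risk category and attack style) used to place prompts in the archive.

```python
# Simplified quality-diversity search: keep the most effective adversarial
# prompt found so far in each (category, style) cell of an archive.
import random
from typing import Callable, Dict, List, Tuple


def quality_diversity_search(
    seed_prompts: List[str],
    mutate: Callable[[str], str],                # LLM-based prompt mutation (assumed)
    judge: Callable[[str], float],               # higher = more effective prompt (assumed)
    describe: Callable[[str], Tuple[str, str]],  # prompt -> archive cell (assumed)
    iterations: int = 1000,
) -> Dict[Tuple[str, str], Tuple[str, float]]:
    archive: Dict[Tuple[str, str], Tuple[str, float]] = {}
    for prompt in seed_prompts:
        archive[describe(prompt)] = (prompt, judge(prompt))

    for _ in range(iterations):
        parent, _ = random.choice(list(archive.values()))
        child = mutate(parent)
        cell, score = describe(child), judge(child)
        if cell not in archive or score > archive[cell][1]:  # keep the best per cell
            archive[cell] = (child, score)
    return archive  # one strong prompt per cell: effective *and* diverse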
Abstract:Captum is a comprehensive library for model explainability in PyTorch, offering a range of methods from the interpretability literature to enhance users' understanding of PyTorch models. In this paper, we introduce new features in Captum that are specifically designed to analyze the behavior of generative language models. We provide an overview of the available functionalities and example applications that demonstrate their potential for understanding learned associations within generative language models.
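For orientation, the snippet below shows roughly how Captum's LLM attribution utilities are used in its public tutorials: a perturbation-based attribution method is wrapped for text generation, and the generated target is attributed back to the input tokens. Class and argument names follow recent Captum releases and may differ slightly across versions, so treat this as an illustrative sketch rather than a definitive API reference.

```python
# Attribute a generated continuation back to the prompt tokens with Captum.
from transformers import AutoModelForCausalLM, AutoTokenizer
from captum.attr import FeatureAblation, LLMAttribution, TextTokenInput

model_name = "gpt2"  # any causal LM works; gpt2 keeps the example small
model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Wrap a perturbation-based attribution method for use with text generation.
fa = FeatureAblation(model)
llm_attr = LLMAttribution(fa, tokenizer)

# Score how much each input token contributed to the target continuation.
inp = TextTokenInput("The first president of the United States was", tokenizer)
result = llm_attr.attribute(inp, target=" George Washington")
print(result.seq_attr)  # one attribution score per input token
```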
Abstract:Neural networks rely on spurious correlations in their predictions, resulting in decreased performance when these correlations do not hold. Recent works suggest freezing pretrained representations and training a classification head that does not use spurious features. We investigate how spurious features are represented in pretrained representations and explore strategies for removing information about spurious features. Considering the Waterbirds dataset and a few pretrained representations, we find that even with full knowledge of the spurious features, their removal is not straightforward because the representations are entangled. To address this, we propose a linear autoencoder training method that separates the representation into core, spurious, and other features. We then propose two effective spurious-feature-removal approaches applied to the encoding, which significantly improve classification performance as measured by worst-group accuracy.
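A minimal sketch of the core idea is given below, assuming a linear autoencoder whose latent space is partitioned into core, spurious, and "other" blocks; removing spurious information then amounts to ablating the spurious block before decoding or classification. This is an illustration of the structure only, not the paper's exact training procedure or losses.

```python
# Linear autoencoder with a partitioned latent space; spurious information is
# removed by zeroing the spurious block of the encoding.
import torch
import torch.nn as nn


class PartitionedLinearAE(nn.Module):
    def __init__(self, dim: int, core: int, spurious: int, other: int):
        super().__init__()
        self.splits = (core, spurious, other)
        self.encoder = nn.Linear(dim, core + spurious + other, bias=False)
        self.decoder = nn.Linear(core + spurious + other, dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.decoder(self.encoder(x))

    def remove_spurious(self, x: torch.Tensor) -> torch.Tensor:
        core, spur, other = torch.split(self.encoder(x), self.splits, dim=-1)
        z = torch.cat([core, torch.zeros_like(spur), other], dim=-1)
        return self.decoder(z)  # representation with the spurious block ablated


# Training would add losses tying the core block to class labels and the
# spurious block to spurious-attribute labels (e.g. background in Waterbirds),
# plus a reconstruction loss; those details are assumptions here.
```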
Abstract:People capture photos and videos to relive and share memories of personal significance. Recently, media montages (stories) have become a popular mode of sharing these memories due to their intuitive and powerful storytelling capabilities. However, creating such montages usually involves many manual searches, clicks, and selections that are time-consuming and cumbersome, adversely affecting the user experience. To alleviate this, we propose task-oriented dialogs for montage creation as a novel interactive tool to seamlessly search, compile, and edit montages from a media collection. To the best of our knowledge, our work is the first to leverage multi-turn conversations for such a challenging application, extending the previous literature studying simple media retrieval tasks. We collect a new dataset, C3 (Conversational Content Creation), comprising 10k dialogs conditioned on media montages simulated from a large media collection. We take a simulate-and-paraphrase approach to collect these dialogs, which is both cost and time efficient while still drawing from a natural language distribution. Our analysis and benchmarking of state-of-the-art language models showcase the multimodal challenges present in the dataset. Lastly, we present a mobile demo application that shows the feasibility of the proposed work in real-world settings. Our code and data will be made publicly available.
Abstract:Despite their wide adoption, the underlying training and memorization dynamics of very large language models are not well understood. We empirically study exact memorization in causal and masked language modeling, across model sizes and throughout the training process. We measure the effects of dataset size, learning rate, and model size on memorization, finding that larger language models memorize training data faster across all settings. Surprisingly, we show that larger models can memorize a larger portion of the data before over-fitting and tend to forget less throughout the training process. We also analyze the memorization dynamics of different parts of speech and find that models memorize nouns and numbers first; we hypothesize and provide empirical evidence that nouns and numbers act as unique identifiers for memorizing individual training examples. Together, these findings present another piece of the broader puzzle of trying to understand what actually improves as models get bigger.
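One natural way to operationalize exact memorization for a causal language model is the fraction of training-set tokens whose greedy (argmax) prediction matches the true next token. The sketch below follows that spirit under the assumption of a HuggingFace-style model returning `.logits`; the paper's precise definition may differ in details.

```python
# Fraction of next tokens in a training batch that the model predicts exactly.
import torch


@torch.no_grad()
def exact_memorization(model, input_ids: torch.Tensor) -> float:
    """input_ids: (batch, seq_len) token ids drawn from the training data."""
    logits = model(input_ids).logits        # (batch, seq_len, vocab)
    preds = logits[:, :-1].argmax(dim=-1)   # greedy prediction for each next token
    targets = input_ids[:, 1:]              # the actual next tokens
    return (preds == targets).float().mean().item()
```

Tracking this quantity across checkpoints and model sizes is enough to reproduce the qualitative comparisons described above (memorization speed, forgetting over training).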