Abstract:Recent advancements have showcased the capabilities of Large Language Models like GPT4 and Llama2 in tasks such as summarization, translation, and content review. However, their widespread use raises concerns, particularly around the potential for LLMs to spread persuasive, humanlike misinformation at scale, which could significantly influence public opinion. This study examines these risks, focusing on LLMs ability to propagate misinformation as factual. To investigate this, we built the LLM Echo Chamber, a controlled digital environment simulating social media chatrooms, where misinformation often spreads. Echo chambers, where individuals only interact with like minded people, further entrench beliefs. By studying malicious bots spreading misinformation in this environment, we can better understand this phenomenon. We reviewed current LLMs, explored misinformation risks, and applied sota finetuning techniques. Using Microsoft phi2 model, finetuned with our custom dataset, we generated harmful content to create the Echo Chamber. This setup, evaluated by GPT4 for persuasiveness and harmfulness, sheds light on the ethical concerns surrounding LLMs and emphasizes the need for stronger safeguards against misinformation.
Abstract:The transferability of adversarial examples is a crucial aspect of evaluating the robustness of deep learning systems, particularly in black-box scenarios. Although several methods have been proposed to enhance cross-model transferability, little attention has been paid to the transferability of adversarial examples across different tasks. This issue has become increasingly relevant with the emergence of foundational multi-task AI systems such as Visual ChatGPT, rendering the utility of adversarial samples generated by a single task relatively limited. Furthermore, these systems often entail inferential functions beyond mere recognition-like tasks. To address this gap, we propose a novel Visual Relation-based cross-task Adversarial Patch generation method called VRAP, which aims to evaluate the robustness of various visual tasks, especially those involving visual reasoning, such as Visual Question Answering and Image Captioning. VRAP employs scene graphs to combine object recognition-based deception with predicate-based relations elimination, thereby disrupting the visual reasoning information shared among inferential tasks. Our extensive experiments demonstrate that VRAP significantly surpasses previous methods in terms of black-box transferability across diverse visual reasoning tasks.