Conor Atkins

Bot Wars Evolved: Orchestrating Competing LLMs in a Counterstrike Against Phone Scams

Mar 10, 2025

Nardine Basta, Conor Atkins, Dali Kaafar

Abstract: We present "Bot Wars," a framework that uses Large Language Model (LLM) scam-baiters to counter phone scams through simulated adversarial dialogues. Our key contribution is a formal foundation for strategy emergence through chain-of-thought reasoning without explicit optimization. Through a novel two-layer prompt architecture, our framework enables LLMs to craft demographically authentic victim personas while maintaining strategic coherence. We evaluate our approach using a dataset of 3,200 scam dialogues validated against 179 hours of human scam-baiting interactions, demonstrating its effectiveness in capturing complex adversarial dynamics. Our systematic evaluation through cognitive, quantitative, and content-specific metrics shows that GPT-4 excels in dialogue naturalness and persona authenticity, while DeepSeek demonstrates superior engagement sustainability.

* Pacific-Asia Conference on Knowledge Discovery and Data Mining, PAKDD 2025

ConvoCache: Smart Re-Use of Chatbot Responses

Jun 26, 2024

Conor Atkins, Ian Wood, Mohamed Ali Kaafar, Hassan Asghar, Nardine Basta, Michal Kepkowski

Abstract: We present ConvoCache, a conversational caching system that solves the problem of slow and expensive generative AI models in spoken chatbots. ConvoCache finds a semantically similar past prompt and reuses its response. In this paper, we evaluate ConvoCache on the DailyDialog dataset. We find that ConvoCache can apply a UniEval coherence threshold of 90% and respond to 89% of prompts using the cache with an average latency of 214 ms, replacing LLM and voice synthesis that can take over 1 s. To further reduce latency, we test prefetching and find it of limited use: prefetching with 80% of a request leads to a 63% hit rate and a drop in overall coherence. ConvoCache can be used with any chatbot to reduce costs by reducing usage of generative AI by up to 89%.

* Accepted to appear at Interspeech 2024
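
The caching idea in the abstract above (find a semantically similar past prompt, reuse its response, otherwise fall back to the generative model) can be illustrated with a minimal sketch. This is not the ConvoCache implementation: the bag-of-words `embed` function stands in for a real sentence encoder, the cosine-similarity `threshold` stands in for the UniEval coherence check the paper describes, and the names `ResponseCache`, `respond`, and `generate` are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: stand-in for a sentence encoder)."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ResponseCache:
    """Stores (prompt embedding, response) pairs and serves close matches."""

    def __init__(self, threshold=0.8):
        # Hypothetical similarity cut-off; the paper filters on coherence instead.
        self.threshold = threshold
        self.entries = []

    def lookup(self, prompt):
        """Return the response of the most similar stored prompt, or None."""
        query = embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self.entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None

    def store(self, prompt, response):
        self.entries.append((embed(prompt), response))


def respond(cache, prompt, generate):
    """Serve from the cache when possible; otherwise call the slow generator."""
    cached = cache.lookup(prompt)
    if cached is not None:
        return cached, True
    response = generate(prompt)
    cache.store(prompt, response)
    return response, False
```

A caller would pass its own LLM call as `generate`; every cache hit skips that call entirely, which is where the latency and cost savings reported in the abstract would come from.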

Those Aren't Your Memories, They're Somebody Else's: Seeding Misinformation in Chat Bot Memories

Apr 06, 2023

Conor Atkins, Benjamin Zi Hao Zhao, Hassan Jameel Asghar, Ian Wood, Mohamed Ali Kaafar

Abstract: One of the new developments in chit-chat bots is a long-term memory mechanism that remembers information from past conversations to increase engagement and consistency of responses. The bot is designed to extract knowledge of a personal nature from its conversation partner, e.g., a stated preference for a particular color. In this paper, we show that this memory mechanism can result in unintended behavior. In particular, we found that one can combine a personal statement with an informative statement, leading the bot to remember the informative statement alongside personal knowledge in its long-term memory. This means that the bot can be tricked into remembering misinformation, which it would regurgitate as statements of fact when recalling information relevant to the topic of conversation. We demonstrate this vulnerability on the BlenderBot 2 framework implemented on the ParlAI platform and provide examples on the more recent and significantly larger BlenderBot 3 model. We generate 150 examples of misinformation, of which 114 (76%) were remembered by BlenderBot 2 when combined with a personal statement. We further assessed the risk of this misinformation being recalled after intervening innocuous conversation and in response to multiple questions relevant to the injected memory. Our evaluation was performed on both the memory-only mode and the combined memory and internet search mode of BlenderBot 2. From the combinations of these variables, we generated 12,890 conversations and analyzed recalled misinformation in the responses. We found that when the chat bot is questioned on the misinformation topic, it was 328% more likely to respond with the misinformation as fact when the misinformation was in the long-term memory.

* To be published in the 21st International Conference on Applied Cryptography and Network Security, ACNS 2023
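
As a quick check on the figures quoted in the abstract above, the short sketch below recomputes the 76% memorization rate from the stated counts and converts "328% more likely" into a multiplier. It assumes that "X% more likely" denotes a relative increase over the baseline rate, which the abstract does not spell out; the helper name is hypothetical.

```python
# Counts stated in the abstract: 114 of 150 misinformation examples were remembered.
remembered, total = 114, 150
remember_rate = remembered / total  # fraction of examples that entered memory


def multiplier_from_percent_increase(percent):
    """Convert 'percent% more likely' into a ratio (assumed: relative increase)."""
    return 1 + percent / 100


recall_multiplier = multiplier_from_percent_increase(328)

print(f"memorization rate: {remember_rate:.0%}")
print(f"recall multiplier under this reading: {recall_multiplier:.2f}x")
```

Under this reading, a response stating the misinformation as fact is a little over four times as likely when the misinformation sits in long-term memory as when it does not.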
