Abstract:Prompt injection attacks can compromise the security and stability of critical systems, from infrastructure to large web applications. This work curates and augments a prompt injection dataset based on the HackAPrompt Playground Submissions corpus and trains several classifiers, including LSTM, feed forward neural networks, Random Forest, and Naive Bayes, to detect malicious prompts in LLM integrated web applications. The proposed approach improves prompt injection detection and mitigation, helping protect targeted applications and systems.




Abstract:The movie recommender system typically leverages user feedback to provide personalized recommendations that align with user preferences and increase business revenue. This study investigates the impact of gender stereotypes on such systems through a specific attack scenario. In this scenario, an attacker determines users' gender, a private attribute, by exploiting gender stereotypes about movie preferences and analyzing users' feedback data, which is either publicly available or observed within the system. The study consists of two phases. In the first phase, a user study involving 630 participants identified gender stereotypes associated with movie genres, which often influence viewing choices. In the second phase, four inference algorithms were applied to detect gender stereotypes by combining the findings from the first phase with users' feedback data. Results showed that these algorithms performed more effectively than relying solely on feedback data for gender inference. Additionally, we quantified the extent of gender stereotypes to evaluate their broader impact on digital computational science. The latter part of the study utilized two major movie recommender datasets: MovieLens 1M and Yahoo!Movie. Detailed experimental information is available on our GitHub repository: https://github.com/fr-iit/GSMRS