Mark Russinovich

Microsoft

Learning to Attack and Defend: Adaptive Red Teaming of Language Models via GRPO

Jun 08, 2026

MetaBackdoor: Exploiting Positional Encoding as a Backdoor Attack Surface in LLMs

May 14, 2026

Optimizing Agent Planning for Security and Autonomy

Feb 11, 2026

GRP-Obliteration: Unaligning LLMs With a Single Unlabeled Prompt

Feb 05, 2026

LogiPlan: A Structured Benchmark for Logical Planning and Relational Reasoning in LLMs

Jun 12, 2025

LLMail-Inject: A Dataset from a Realistic Adaptive Prompt Injection Challenge

Jun 11, 2025

Securing AI Agents with Information-Flow Control

May 29, 2025

Jailbreaking is (Mostly) Simpler Than You Think

Mar 07, 2025

Obliviate: Efficient Unmemorization for Protecting Intellectual Property in Large Language Models

Feb 20, 2025

Lessons From Red Teaming 100 Generative AI Products

Jan 13, 2025