
Yevgeniy Vorobeychik

AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs

Oct 14, 2024
Via arXiv

Adaptive Recruitment Resource Allocation to Improve Cohort Representativeness in Participatory Biomedical Datasets

Aug 02, 2024
Via arXiv

Dataset Representativeness and Downstream Task Fairness

Jun 28, 2024
Via arXiv

Adversarial Machine Unlearning

Jun 11, 2024
Via arXiv

GOMAA-Geo: GOal Modality Agnostic Active Geo-localization

Jun 04, 2024
Via arXiv

Verified Safe Reinforcement Learning for Neural Network Dynamic Models

May 25, 2024
Via arXiv

Axioms for AI Alignment from Human Feedback

May 23, 2024
Via arXiv

Learning Linear Utility Functions From Pairwise Comparison Queries

May 07, 2024
Via arXiv

Attacks on Node Attributes in Graph Neural Networks

Feb 19, 2024
Via arXiv

Learning Interpretable Policies in Hindsight-Observable POMDPs through Partially Supervised Reinforcement Learning

Feb 14, 2024
Via arXiv