
Yanghao Zhang

Safeguarding Large Language Models: A Survey

Jun 03, 2024

Towards Fairness-Aware Adversarial Learning

Feb 27, 2024

Reward Certification for Policy Smoothed Reinforcement Learning

Dec 12, 2023

A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation

May 19, 2023

Gradient-Guided Dynamic Efficient Adversarial Training

Mar 04, 2021

Fooling Object Detectors: Adversarial Attacks by Half-Neighbor Masks

Jan 04, 2021

Generalizing Universal Adversarial Attacks Beyond Additive Perturbations

Oct 29, 2020