Victor Gallego

Merging Improves Self-Critique Against Jailbreak Attacks

Jun 11, 2024
Via arXiv

Configurable Safety Tuning of Language Models with Synthetic Preference Data

Mar 30, 2024
Via arXiv

Distilled Self-Critique of LLMs with Synthetic Data: a Bayesian Perspective

Dec 04, 2023
Via arXiv

ZYN: Zero-Shot Reward Models with Yes-No Questions

Aug 11, 2023
Via arXiv

Personalizing Text-to-Image Generation via Aesthetic Gradients

Sep 25, 2022
Via arXiv

Protecting Classifiers From Attacks. A Bayesian Approach

Apr 18, 2020
Via arXiv

Adversarial Machine Learning: Perspectives from Adversarial Risk Analysis

Mar 07, 2020
Via arXiv

Variationally Inferred Sampling Through a Refined Bound for Probabilistic Programs

Sep 23, 2019
Via arXiv

Opponent Aware Reinforcement Learning

Aug 26, 2019
Via arXiv

Stochastic Gradient MCMC with Repulsive Forces

Nov 30, 2018
Via arXiv