Leo Schwinn

Extracting Unlearned Information from LLMs with Activation Steering

Nov 04, 2024
Via arXiv

A Probabilistic Perspective on Unlearning and Alignment for Large Language Models

Oct 04, 2024
Via arXiv

Flow Matching with Gaussian Process Priors for Probabilistic Time Series Forecasting

Oct 03, 2024
Via arXiv

Caption-Driven Explorations: Aligning Image and Text Embeddings through Human-Inspired Foveated Vision

Aug 19, 2024
Via arXiv

Relaxing Graph Transformers for Adversarial Attacks

Jul 16, 2024
Via arXiv

Large-Scale Dataset Pruning in Adversarial Training through Data Importance Extrapolation

Jun 19, 2024
Via arXiv

Efficient Time Series Processing for Transformers and State-Space Models through Token Merging

May 28, 2024
Via arXiv

Efficient Adversarial Training in LLMs with Continuous Attacks

May 24, 2024
Via arXiv

A Unified Approach Towards Active Learning and Out-of-Distribution Detection

May 18, 2024
Via arXiv

Soft Prompt Threats: Attacking Safety Alignment and Unlearning in Open-Source LLMs through the Embedding Space

Feb 14, 2024
Via arXiv