Wenkai Yang

Super(ficial)-alignment: Strong Models May Deceive Weak Models in Weak-to-Strong Generalization

Jun 17, 2024
Via arXiv

Exploring Backdoor Vulnerabilities of Chat Models

Apr 03, 2024
Via arXiv

Watch Out for Your Agents! Investigating Backdoor Threats to LLM-Based Agents

Feb 17, 2024
Via arXiv

Enabling Large Language Models to Learn from Rules

Nov 15, 2023
Via arXiv

Two Stream Scene Understanding on Graph Embedding

Nov 12, 2023
Via arXiv

Towards Codable Text Watermarking for Large Language Models

Jul 29, 2023
Via arXiv

Communication Efficient Federated Learning for Multilingual Neural Machine Translation with Adapter

May 21, 2023
Via arXiv

Fine-Tuning Deteriorates General Textual Out-of-Distribution Detection by Distorting Task-Agnostic Features

Jan 30, 2023
Via arXiv

Integrating Local Real Data with Global Gradient Prototypes for Classifier Re-Balancing in Federated Long-Tailed Learning

Jan 26, 2023
Via arXiv

When to Trust Aggregated Gradients: Addressing Negative Client Sampling in Federated Learning

Jan 25, 2023
Via arXiv