
Xinyu Xing

A Survey on Explainable Deep Reinforcement Learning

Feb 08, 2025

Soft-Label Integration for Robust Toxicity Classification

Oct 18, 2024

UTF: Undertrained Tokens as Fingerprints: A Novel Approach to LLM Identification

Oct 16, 2024

BlockFound: Customized blockchain foundation model for anomaly detection

Oct 14, 2024

Enhancing Jailbreak Attack Against Large Language Models through Silent Tokens

May 31, 2024

RICE: Breaking Through the Training Bottlenecks of Reinforcement Learning with Explanation

May 05, 2024

Assessing Prompt Injection Risks in 200+ Custom GPTs

Nov 20, 2023

GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts

Sep 19, 2023

BACKDOORL: Backdoor Attack against Competitive Reinforcement Learning

May 07, 2021

Robust saliency maps with decoy-enhanced saliency score

Feb 03, 2020