Dongrui Liu

SEER: Self-Explainability Enhancement of Large Language Models' Representations

Feb 07, 2025

Explainable and Interpretable Multimodal Large Language Models: A Comprehensive Survey

Dec 03, 2024

VLSBench: Unveiling Visual Leakage in Multimodal Safety

Nov 29, 2024

DEAN: Deactivating the Coupled Neurons to Mitigate Fairness-Privacy Conflicts in Large Language Models

Oct 22, 2024

REEF: Representation Encoding Fingerprints for Large Language Models

Oct 18, 2024

Derail Yourself: Multi-turn LLM Jailbreak Attack through Self-discovered Clues

Oct 14, 2024

Decouple-Then-Merge: Towards Better Training for Diffusion Models

Oct 09, 2024

Towards the Dynamics of a DNN Learning Symbolic Interactions

Jul 27, 2024

The Better Angels of Machine Personality: How Personality Relates to LLM Safety

Jul 17, 2024

MLP Can Be A Good Transformer Learner

Apr 08, 2024