Denghui Zhang

ISACL: Internal State Analyzer for Copyrighted Training Data Leakage

Aug 25, 2025
Via arXiv

Beyond Reactive Safety: Risk-Aware LLM Alignment via Long-Horizon Simulation

Jun 26, 2025
Via arXiv

Atomic Reasoning for Scientific Table Claim Verification

Jun 08, 2025
Via arXiv

DEL-ToM: Inference-Time Scaling for Theory-of-Mind Reasoning via Dynamic Epistemic Logic

May 22, 2025
Via arXiv

RM-R1: Reward Modeling as Reasoning

May 05, 2025
Via arXiv

Sensitivity Meets Sparsity: The Impact of Extremely Sparse Parameter Patterns on Theory-of-Mind of Large Language Models

Apr 05, 2025
Via arXiv

ALinFiK: Learning to Approximate Linearized Future Influence Kernel for Scalable Third-Party LLM Data Valuation

Mar 02, 2025
Via arXiv

A Survey on Data-Centric AI: Tabular Learning from Reinforcement Learning and Generative AI Perspective

Feb 12, 2025
Via arXiv

Internal Activation as the Polar Star for Steering Unsafe LLM Behavior

Feb 04, 2025
Via arXiv

EscapeBench: Pushing Language Models to Think Outside the Box

Dec 18, 2024
Via arXiv