Gjergji Kasneci

Reinforcement Unlearning via Group Relative Policy Optimization

Jan 28, 2026

Moral Lenses, Political Coordinates: Towards Ideological Positioning of Morally Conditioned LLMs

Jan 13, 2026

Injecting Falsehoods: Adversarial Man-in-the-Middle Attacks Undermining Factual Recall in LLMs

Nov 08, 2025

Structured Universal Adversarial Attacks on Object Detection for Video Sequences

Oct 16, 2025

CURE: Controlled Unlearning for Robust Embeddings -- Mitigating Conceptual Shortcuts in Pre-Trained Language Models

Sep 05, 2025

Probabilistic Aggregation and Targeted Embedding Optimization for Collective Moral Reasoning in Large Language Models

Jun 18, 2025

SCISSOR: Mitigating Semantic Bias through Cluster-Aware Siamese Networks for Robust Classification

Jun 17, 2025

Position: Uncertainty Quantification Needs Reassessment for Large-language Model Agents

May 28, 2025

Graph Style Transfer for Counterfactual Explainability

May 23, 2025

Gender Bias in Explainability: Investigating Performance Disparity in Post-hoc Methods

May 02, 2025