Arush Tagade

LLM Assertiveness can be Mechanistically Decomposed into Emotional and Logical Components

Aug 24, 2025
Via arXiv

Benchmarking the Discovery Engine

Jul 01, 2025
Via arXiv

The SaTML '24 CNN Interpretability Competition: New Innovations for Concept-Level Interpretability

Apr 03, 2024
Via arXiv

Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation

Nov 06, 2023
Via arXiv

Prototype Generation: Robust Feature Visualisation for Data Independent Interpretability

Sep 29, 2023
Via arXiv

Why do CNNs excel at feature extraction? A mathematical explanation

Jul 03, 2023
Via arXiv