Karthikeyan Natesan Ramamurthy

Final-Model-Only Data Attribution with a Unifying View of Gradient-Based Methods
Dec 05, 2024

Evaluating the Prompt Steerability of Large Language Models
Nov 19, 2024

Identifying Sub-networks in Neural Networks via Functionally Similar Representations
Oct 21, 2024

Programming Refusal with Conditional Activation Steering
Sep 06, 2024

Value Alignment from Unstructured Text
Aug 19, 2024

Reasoning about concepts with LLMs: Inconsistencies abound
May 30, 2024

Multi-Level Explanations for Generative Language Models
Mar 21, 2024

Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations
Mar 08, 2024

Trust Regions for Explanations via Black-Box Probabilistic Certification
Feb 21, 2024

Ranking Large Language Models without Ground Truth
Feb 21, 2024