Huanqian Wang

Model Surgery: Modulating LLM's Behavior Via Simple Parameter Editing

Jul 11, 2024
Via arXiv

Leveraging Reward Consistency for Interpretable Feature Discovery in Reinforcement Learning

Sep 04, 2023
Via arXiv