
W. Bradley Knox

Harmful Traits of AI Companions

Nov 18, 2025

State Your Intention to Steer Your Attention: An AI Assistant for Intentional Digital Living

Oct 16, 2025

CTRL-Rec: Controlling Recommender Systems With Natural Language

Oct 14, 2025

Towards Improving Reward Design in RL: A Reward Alignment Metric for RL Practitioners

Mar 08, 2025

Influencing Humans to Conform to Preference Models for RLHF

Jan 11, 2025

MobileSafetyBench: Evaluating Safety of Autonomous Agents in Mobile Device Control

Oct 23, 2024

Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions

Oct 17, 2024

Contrastive Preference Learning: Learning from Human Feedback without RL

Oct 24, 2023

Learning Optimal Advantage from Preferences and Mistaking it for Reward

Oct 03, 2023

Models of human preference for learning reward functions

Jun 05, 2022