Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Roma Patel

Value Profiles for Encoding Human Variation

Mar 19, 2025

Taylor Sorensen, Pushkar Mishra, Roma Patel, Michael Henry Tessler, Michiel Bakker, Georgina Evans, Iason Gabriel, Noah Goodman, Verena Rieser

Abstract:Modelling human variation in rating tasks is crucial for enabling AI systems for personalization, pluralistic model alignment, and computational social science. We propose representing individuals using value profiles -- natural language descriptions of underlying values compressed from in-context demonstrations -- along with a steerable decoder model to estimate ratings conditioned on a value profile or other rater information. To measure the predictive information in rater representations, we introduce an information-theoretic methodology. We find that demonstrations contain the most information, followed by value profiles and then demographics. However, value profiles offer advantages in terms of scrutability, interpretability, and steerability due to their compressed natural language format. Value profiles effectively compress the useful information from demonstrations (>70% information preservation). Furthermore, clustering value profiles to identify similarly behaving individuals better explains rater variation than the most predictive demographic groupings. Going beyond test set performance, we show that the decoder models interpretably change ratings according to semantic profile differences, are well-calibrated, and can help explain instance-level disagreement by simulating an annotator population. These results demonstrate that value profiles offer novel, predictive ways to describe individual variation beyond demographics or group information.

Via

Access Paper or Ask Questions

Utility-inspired Reward Transformations Improve Reinforcement Learning Training of Language Models

Jan 08, 2025

Roberto-Rafael Maura-Rivero, Chirag Nagpal, Roma Patel, Francesco Visin

Abstract:Current methods that train large language models (LLMs) with reinforcement learning feedback, often resort to averaging outputs of multiple rewards functions during training. This overlooks crucial aspects of individual reward dimensions and inter-reward dependencies that can lead to sub-optimal outcomes in generations. In this work, we show how linear aggregation of rewards exhibits some vulnerabilities that can lead to undesired properties of generated text. We then propose a transformation of reward functions inspired by economic theory of utility functions (specifically Inada conditions), that enhances sensitivity to low reward values while diminishing sensitivity to already high values. We compare our approach to the existing baseline methods that linearly aggregate rewards and show how the Inada-inspired reward feedback is superior to traditional weighted averaging. We quantitatively and qualitatively analyse the difference in the methods, and see that models trained with Inada-transformations score as more helpful while being less harmful.

Via

Access Paper or Ask Questions

Insights on Disagreement Patterns in Multimodal Safety Perception across Diverse Rater Groups

Oct 22, 2024

Charvi Rastogi, Tian Huey Teh, Pushkar Mishra, Roma Patel, Zoe Ashwood, Aida Mostafazadeh Davani, Mark Diaz, Michela Paganini, Alicia Parrish, Ding Wang(+3 more)

Figure 1 for Insights on Disagreement Patterns in Multimodal Safety Perception across Diverse Rater Groups

Figure 2 for Insights on Disagreement Patterns in Multimodal Safety Perception across Diverse Rater Groups

Figure 3 for Insights on Disagreement Patterns in Multimodal Safety Perception across Diverse Rater Groups

Figure 4 for Insights on Disagreement Patterns in Multimodal Safety Perception across Diverse Rater Groups

Abstract:AI systems crucially rely on human ratings, but these ratings are often aggregated, obscuring the inherent diversity of perspectives in real-world phenomenon. This is particularly concerning when evaluating the safety of generative AI, where perceptions and associated harms can vary significantly across socio-cultural contexts. While recent research has studied the impact of demographic differences on annotating text, there is limited understanding of how these subjective variations affect multimodal safety in generative AI. To address this, we conduct a large-scale study employing highly-parallel safety ratings of about 1000 text-to-image (T2I) generations from a demographically diverse rater pool of 630 raters balanced across 30 intersectional groups across age, gender, and ethnicity. Our study shows that (1) there are significant differences across demographic groups (including intersectional groups) on how severe they assess the harm to be, and that these differences vary across different types of safety violations, (2) the diverse rater pool captures annotation patterns that are substantially different from expert raters trained on specific set of safety policies, and (3) the differences we observe in T2I safety are distinct from previously documented group level differences in text-based safety tasks. To further understand these varying perspectives, we conduct a qualitative analysis of the open-ended explanations provided by raters. This analysis reveals core differences into the reasons why different groups perceive harms in T2I generations. Our findings underscore the critical need for incorporating diverse perspectives into safety evaluation of generative AI ensuring these systems are truly inclusive and reflect the values of all users.

* 20 pages, 7 figures

Via

Access Paper or Ask Questions

Skill Generalization with Verbs

Oct 18, 2024

Rachel Ma, Lyndon Lam, Benjamin A. Spiegel, Aditya Ganeshan, Roma Patel, Ben Abbatematteo, David Paulius, Stefanie Tellex, George Konidaris

Figure 1 for Skill Generalization with Verbs

Figure 2 for Skill Generalization with Verbs

Figure 3 for Skill Generalization with Verbs

Figure 4 for Skill Generalization with Verbs

Abstract:It is imperative that robots can understand natural language commands issued by humans. Such commands typically contain verbs that signify what action should be performed on a given object and that are applicable to many objects. We propose a method for generalizing manipulation skills to novel objects using verbs. Our method learns a probabilistic classifier that determines whether a given object trajectory can be described by a specific verb. We show that this classifier accurately generalizes to novel object categories with an average accuracy of 76.69% across 13 object categories and 14 verbs. We then perform policy search over the object kinematics to find an object trajectory that maximizes classifier prediction for a given verb. Our method allows a robot to generate a trajectory for a novel object based on a verb, which can then be used as input to a motion planner. We show that our model can generate trajectories that are usable for executing five verb commands applied to novel instances of two different object categories on a real robot.

* 7 pages + 2 pages (references), 6 figures. Accepted at IROS 2023. Code, dataset info and demo videos can be found at: https://rachelma80000.github.io/SkillGenVerbs/

Via

Access Paper or Ask Questions

States as Strings as Strategies: Steering Language Models with Game-Theoretic Solvers

Feb 06, 2024

Ian Gemp, Yoram Bachrach, Marc Lanctot, Roma Patel, Vibhavari Dasagi, Luke Marris, Georgios Piliouras, Siqi Liu, Karl Tuyls

Figure 1 for States as Strings as Strategies: Steering Language Models with Game-Theoretic Solvers

Figure 2 for States as Strings as Strategies: Steering Language Models with Game-Theoretic Solvers

Figure 3 for States as Strings as Strategies: Steering Language Models with Game-Theoretic Solvers

Figure 4 for States as Strings as Strategies: Steering Language Models with Game-Theoretic Solvers

Abstract:Game theory is the study of mathematical models of strategic interactions among rational agents. Language is a key medium of interaction for humans, though it has historically proven difficult to model dialogue and its strategic motivations mathematically. A suitable model of the players, strategies, and payoffs associated with linguistic interactions (i.e., a binding to the conventional symbolic logic of game theory) would enable existing game-theoretic algorithms to provide strategic solutions in the space of language. In other words, a binding could provide a route to computing stable, rational conversational strategies in dialogue. Large language models (LLMs) have arguably reached a point where their generative capabilities can enable realistic, human-like simulations of natural dialogue. By prompting them in various ways, we can steer their responses towards different output utterances. Leveraging the expressivity of natural language, LLMs can also help us quickly generate new dialogue scenarios, which are grounded in real world applications. In this work, we present one possible binding from dialogue to game theory as well as generalizations of existing equilibrium finding algorithms to this setting. In addition, by exploiting LLMs generation capabilities along with our proposed binding, we can synthesize a large repository of formally-defined games in which one can study and test game-theoretic solution concepts. We also demonstrate how one can combine LLM-driven game generation, game-theoretic solvers, and imitation learning to construct a process for improving the strategic capabilities of LLMs.

* 32 pages, 8 figures, code available @ https://github.com/google-deepmind/open_spiel/blob/master/open_spiel/python/games/chat_game.py

Via

Access Paper or Ask Questions

Pragmatics in Grounded Language Learning: Phenomena, Tasks, and Modeling Approaches

Nov 15, 2022

Daniel Fried, Nicholas Tomlin, Jennifer Hu, Roma Patel, Aida Nematzadeh

Abstract:People rely heavily on context to enrich meaning beyond what is literally said, enabling concise but effective communication. To interact successfully and naturally with people, user-facing artificial intelligence systems will require similar skills in pragmatics: relying on various types of context -- from shared linguistic goals and conventions, to the visual and embodied world -- to use language effectively. We survey existing grounded settings and pragmatic modeling approaches and analyze how the task goals, environmental contexts, and communicative affordances in each work enrich linguistic meaning. We present recommendations for future grounded task design to naturally elicit pragmatic phenomena, and suggest directions that focus on a broader range of communicative contexts and affordances.

Via

Access Paper or Ask Questions

RLang: A Declarative Language for Expression Prior Knowledge for Reinforcement Learning

Aug 16, 2022

Rafael Rodriguez-Sanchez, Benjamin A. Spiegel, Jennifer Wang, Roma Patel, Stefanie Tellex, George Konidaris

Figure 1 for RLang: A Declarative Language for Expression Prior Knowledge for Reinforcement Learning

Figure 2 for RLang: A Declarative Language for Expression Prior Knowledge for Reinforcement Learning

Figure 3 for RLang: A Declarative Language for Expression Prior Knowledge for Reinforcement Learning

Figure 4 for RLang: A Declarative Language for Expression Prior Knowledge for Reinforcement Learning

Abstract:Communicating useful background knowledge to reinforcement learning (RL) agents is an important and effective method for accelerating learning. We introduce RLang, a domain-specific language (DSL) for communicating domain knowledge to an RL agent. Unlike other existing DSLs proposed by the RL community that ground to single elements of a decision-making formalism (e.g., the reward function or policy function), RLang can specify information about every element of a Markov decision process. We define precise syntax and grounding semantics for RLang, and provide a parser implementation that grounds RLang programs to an algorithm-agnostic partial world model and policy that can be exploited by an RL agent. We provide a series of example RLang programs, and demonstrate how different RL methods can exploit the resulting knowledge, including model-free and model-based tabular algorithms, hierarchical approaches, and deep RL algorithms (including both policy gradient and value-based methods).

Via

Access Paper or Ask Questions

Generalizing to New Domains by Mapping Natural Language to Lifted LTL

Oct 11, 2021

Eric Hsiung, Hiloni Mehta, Junchi Chu, Xinyu Liu, Roma Patel, Stefanie Tellex, George Konidaris

Figure 1 for Generalizing to New Domains by Mapping Natural Language to Lifted LTL

Figure 2 for Generalizing to New Domains by Mapping Natural Language to Lifted LTL

Figure 3 for Generalizing to New Domains by Mapping Natural Language to Lifted LTL

Figure 4 for Generalizing to New Domains by Mapping Natural Language to Lifted LTL

Abstract:Recent work on using natural language to specify commands to robots has grounded that language to LTL. However, mapping natural language task specifications to LTL task specifications using language models require probability distributions over finite vocabulary. Existing state-of-the-art methods have extended this finite vocabulary to include unseen terms from the input sequence to improve output generalization. However, novel out-of-vocabulary atomic propositions cannot be generated using these methods. To overcome this, we introduce an intermediate contextual query representation which can be learned from single positive task specification examples, associating a contextual query with an LTL template. We demonstrate that this intermediate representation allows for generalization over unseen object references, assuming accurate groundings are available. We compare our method of mapping natural language task specifications to intermediate contextual queries against state-of-the-art CopyNet models capable of translating natural language to LTL, by evaluating whether correct LTL for manipulation and navigation task specifications can be output, and show that our method outperforms the CopyNet model on unseen object references. We demonstrate that the grounded LTL our method outputs can be used for planning in a simulated OO-MDP environment. Finally, we discuss some common failure modes encountered when translating natural language task specifications to grounded LTL.

* 7 pages (6 + 1 references page), 3 figures, 2 tables. Submitted to ICRA 2022

Via

Access Paper or Ask Questions

Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding

Oct 15, 2020

Alexander Ku, Peter Anderson, Roma Patel, Eugene Ie, Jason Baldridge

Figure 1 for Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding

Figure 2 for Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding

Figure 3 for Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding

Figure 4 for Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding

Abstract:We introduce Room-Across-Room (RxR), a new Vision-and-Language Navigation (VLN) dataset. RxR is multilingual (English, Hindi, and Telugu) and larger (more paths and instructions) than other VLN datasets. It emphasizes the role of language in VLN by addressing known biases in paths and eliciting more references to visible entities. Furthermore, each word in an instruction is time-aligned to the virtual poses of instruction creators and validators. We establish baseline scores for monolingual and multilingual settings and multitask learning when including Room-to-Room annotations. We also provide results for a model that learns from synchronized pose traces by focusing only on portions of the panorama attended to in human demonstrations. The size, scope and detail of RxR dramatically expands the frontier for research on embodied language agents in simulated, photo-realistic environments.

* EMNLP 2020

Via

Access Paper or Ask Questions

Robot Object Retrieval with Contextual Natural Language Queries

Jun 23, 2020

Thao Nguyen, Nakul Gopalan, Roma Patel, Matt Corsaro, Ellie Pavlick, Stefanie Tellex

Figure 1 for Robot Object Retrieval with Contextual Natural Language Queries

Figure 2 for Robot Object Retrieval with Contextual Natural Language Queries

Figure 3 for Robot Object Retrieval with Contextual Natural Language Queries

Figure 4 for Robot Object Retrieval with Contextual Natural Language Queries

Abstract:Natural language object retrieval is a highly useful yet challenging task for robots in human-centric environments. Previous work has primarily focused on commands specifying the desired object's type such as "scissors" and/or visual attributes such as "red," thus limiting the robot to only known object classes. We develop a model to retrieve objects based on descriptions of their usage. The model takes in a language command containing a verb, for example "Hand me something to cut," and RGB images of candidate objects and selects the object that best satisfies the task specified by the verb. Our model directly predicts an object's appearance from the object's use specified by a verb phrase. We do not need to explicitly specify an object's class label. Our approach allows us to predict high level concepts like an object's utility based on the language query. Based on contextual information present in the language commands, our model can generalize to unseen object classes and unknown nouns in the commands. Our model correctly selects objects out of sets of five candidates to fulfill natural language commands, and achieves an average accuracy of 62.3% on a held-out test set of unseen ImageNet object classes and 53.0% on unseen object classes and unknown nouns. Our model also achieves an average accuracy of 54.7% on unseen YCB object classes, which have a different image distribution from ImageNet objects. We demonstrate our model on a KUKA LBR iiwa robot arm, enabling the robot to retrieve objects based on natural language descriptions of their usage. We also present a new dataset of 655 verb-object pairs denoting object usage over 50 verbs and 216 object classes.

Via

Access Paper or Ask Questions