Department of Communications and Networking, Aalto University, Finland
Abstract:Developing a reinforcement learning (RL) agent often involves identifying effective values for a large number of parameters, covering the policy, reward function, environment, and the agent's internal architecture, such as parameters controlling how the peripheral vision and memory modules work. Critically, since these parameters are interrelated in complex ways, optimizing them can be viewed as a black box optimization problem, which is especially challenging for non-experts. Although existing optimization-as-a-service platforms (e.g., Vizier, Optuna) can handle such problems, they are impractical for RL systems, as users must manually map each parameter to different components, making the process cumbersome and error-prone. They also require a deep understanding of the optimization process, which limits their use to ML experts and restricts access for fields like cognitive science, which models human decision-making. To tackle these challenges, we present AgentForge, a flexible low-code framework to optimize any parameter set across an RL system. AgentForge allows the user to perform individual or joint optimization of parameter sets. An optimization problem can be defined in a few lines of code and handed to any of the interfaced optimizers. We evaluated its performance in a challenging vision-based RL problem. AgentForge enables practitioners to develop RL agents without requiring extensive coding or deep expertise in optimization.
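The idea of treating parameters from several components as one black box search space can be illustrated with a minimal sketch. This is not AgentForge's API: the parameter names, ranges, toy objective, and the use of plain random search are all assumptions made for illustration, and the objective stands in for training and evaluating an actual agent.

```python
import random

# Hypothetical joint search space spanning policy, reward, and agent
# architecture. Names and ranges are illustrative assumptions only.
SEARCH_SPACE = {
    "policy.learning_rate": (1e-5, 1e-2),
    "reward.time_penalty": (0.0, 1.0),
    "agent.peripheral_noise": (0.0, 0.5),
}


def sample(space, rng):
    """Draw one configuration uniformly from each parameter's range."""
    return {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}


def toy_objective(params):
    """Stand-in for 'train the agent, return its score' (higher is better).

    The target values are arbitrary; a real objective would run RL training.
    """
    target = {
        "policy.learning_rate": 3e-4,
        "reward.time_penalty": 0.2,
        "agent.peripheral_noise": 0.1,
    }
    return -sum((params[k] - target[k]) ** 2 for k in target)


def random_search(space, objective, n_trials=50, seed=0):
    """Evaluate n_trials random configurations and keep the best one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        params = sample(space, rng)
        score = objective(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(SEARCH_SPACE, toy_objective)
```

Because the optimizer only sees a flat dictionary and a score, any other black box optimizer could replace `random_search` without changing how the search space is declared.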
Abstract:Present-day graphical user interfaces (GUIs) exhibit diverse arrangements of text, graphics, and interactive elements such as buttons and menus, but representations of GUIs have not kept up. They do not encapsulate both semantic and visuo-spatial relationships among elements. To seize machine learning's potential for GUIs more efficiently, Graph4GUI exploits graph neural networks to capture individual elements' properties and their semantic-visuo-spatial constraints in a layout. The learned representation demonstrated its effectiveness in multiple tasks, especially generating designs in a challenging GUI autocompletion task, which involved predicting the positions of remaining unplaced elements in a partially completed GUI. The new model's suggestions showed alignment and visual appeal superior to the baseline method and received higher subjective ratings for preference. Furthermore, we demonstrate the practical benefits and efficiency advantages designers perceive when utilizing our model as an autocompletion plug-in.
Abstract:From a visual perception perspective, modern graphical user interfaces (GUIs) comprise a complex graphics-rich two-dimensional visuospatial arrangement of text, images, and interactive objects such as buttons and menus. While existing models can accurately predict regions and objects that are likely to attract attention "on average", so far there is no scanpath model capable of predicting scanpaths for an individual. To close this gap, we introduce EyeFormer, which leverages a Transformer architecture as a policy network to guide a deep reinforcement learning algorithm that controls gaze locations. Our model has the unique capability of producing personalized predictions when given a few user scanpath samples. It can predict full scanpath information, including fixation positions and duration, across individuals and various stimulus types. Additionally, we demonstrate applications in GUI layout optimization driven by our model. Our software and models will be publicly available.
Abstract:This paper presents a model of pedestrian crossing decisions, based on the theory of computational rationality. It is assumed that crossing decisions are boundedly optimal, with bounds on optimality arising from human cognitive limitations. While previous models of pedestrian behaviour have been either 'black-box' machine learning models or mechanistic models with explicit assumptions about cognitive factors, we combine both approaches. Specifically, we mechanistically model noisy human visual perception and assumed rewards in crossing, but we use reinforcement learning to learn a boundedly optimal behaviour policy. The model reproduces a larger number of known empirical phenomena than previous models, in particular: (1) the effect of the time to arrival of an approaching vehicle on whether the pedestrian accepts the gap, the effect of the vehicle's speed on both (2) gap acceptance and (3) pedestrian timing of crossing in front of yielding vehicles, and (4) the effect on this crossing timing of the stopping distance of the yielding vehicle. Notably, our findings suggest that behaviours previously framed as 'biases' in decision-making, such as speed-dependent gap acceptance, might instead be a product of rational adaptation to the constraints of visual perception. Our approach also permits fitting the parameters of cognitive constraints and rewards per individual, to better account for individual differences. To conclude, by leveraging both RL and mechanistic modelling, our model offers novel insights about pedestrian behaviour, and may provide a useful foundation for more accurate and scalable pedestrian models.
Abstract:Understanding the interaction between different road users is critical for road safety and automated vehicles (AVs). Existing mathematical models on this topic have mostly been based on either cognitive or machine learning (ML) approaches. However, current cognitive models are incapable of simulating road user trajectories in general scenarios, and ML models lack a focus on the mechanisms generating the behavior and take a high-level perspective, which can cause them to miss important human-like behaviors. Here, we develop a model of human pedestrian crossing decisions based on computational rationality, an approach using deep reinforcement learning (RL) to learn boundedly optimal behavior policies given human constraints, in our case a model of the limited human visual system. We show that the proposed combined cognitive-RL model captures human-like patterns of gap acceptance and crossing initiation time. Interestingly, our model's decisions are sensitive not only to the time gap but also to the speed of the approaching vehicle, something which has been described as a "bias" in human gap acceptance behavior. However, our results suggest that this is instead a rational adaptation to human perceptual limitations. Moreover, we demonstrate an approach to accounting for individual differences in computational rationality models, by conditioning the RL policy on the parameters of the human constraints. Our results demonstrate the feasibility of generating more human-like road user behavior by combining RL with cognitive models.
Abstract:Designers reportedly struggle with design optimization tasks where they are asked to find a combination of design parameters that maximizes a given set of objectives. In HCI, design optimization problems are often exceedingly complex, involving multiple objectives and expensive empirical evaluations. Model-based computational design algorithms assist designers by generating design examples during design; however, they assume a model of the interaction domain. Black box methods for assistance, on the other hand, can work with any design problem. However, virtually all empirical studies of this human-in-the-loop approach have been carried out by either researchers or end-users. The question remains whether such methods can help designers in realistic tasks. In this paper, we study Bayesian optimization as an algorithmic method to guide the design optimization process. It operates by proposing to a designer which design candidate to try next, given previous observations. We report observations from a comparative study with 40 novice designers who were tasked with optimizing a complex 3D touch interaction technique. The optimizer helped designers explore larger proportions of the design space and arrive at a better solution; however, they reported lower agency and expressiveness. Designers guided by an optimizer reported lower mental effort but also felt less creative and less in charge of the progress. We conclude that human-in-the-loop optimization can support novice designers in cases where agency is not critical.
Abstract:Affordance refers to the perception of possible actions allowed by an object. Despite its relevance to human-computer interaction, no existing theory explains the mechanisms that underpin affordance-formation; that is, how affordances are discovered and adapted via interaction. We propose an integrative theory of affordance-formation based on the theory of reinforcement learning in cognitive sciences. The key assumption is that users learn to associate promising motor actions with percepts via experience when reinforcement signals (success/failure) are present. They also learn to categorize actions (e.g., "rotating" a dial), giving them the ability to name and reason about affordance. Upon encountering novel widgets, their ability to generalize these actions determines their ability to perceive affordances. We implement this theory in a virtual robot model, which demonstrates human-like adaptation of affordance in interactive widget tasks. While its predictions align with trends in human data, humans are able to adapt affordances faster, suggesting the existence of additional mechanisms.
Abstract:AI for supporting designers needs to be rethought. It should aim to cooperate, not automate, by supporting and leveraging the creativity and problem-solving of designers. The challenge for such AI is how to infer designers' goals and then help them without being needlessly disruptive. We present AI-assisted design: a framework for creating such AI, built around generative user models which enable reasoning about designers' goals, reasoning, and capabilities.
Abstract:Adapting an interface requires taking into account both the positive and negative effects that changes may have on the user. A carelessly picked adaptation may impose high costs on the user -- for example, due to surprise or relearning effort -- or prematurely "trap" the process in a suboptimal design. However, effects on users are hard to predict as they depend on factors that are latent and evolve over the course of interaction. We propose a novel approach for adaptive user interfaces that yields a conservative adaptation policy: It finds beneficial changes when there are such and avoids changes when there are none. Our model-based reinforcement learning method plans sequences of adaptations and consults predictive HCI models to estimate their effects. We present empirical and simulation results from the case of adaptive menus, showing that the method outperforms both a non-adaptive and a frequency-based policy.
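The notion of a conservative adaptation policy can be illustrated with a minimal sketch. It is a one-step greedy simplification, not the paper's method, which plans sequences of adaptations with model-based reinforcement learning. The function name, the candidate adaptations, and the gain and cost numbers are hypothetical stand-ins for the outputs of predictive HCI models.

```python
def choose_adaptation(candidates, predict_gain, predict_cost):
    """Return the candidate with the highest positive predicted net benefit.

    Returns None (keep the current design) when no candidate is predicted
    to be beneficial, which is what makes the policy conservative.
    """
    best, best_net = None, 0.0
    for candidate in candidates:
        net = predict_gain(candidate) - predict_cost(candidate)
        if net > best_net:
            best, best_net = candidate, net
    return best


# Hypothetical predictions for two menu adaptations (arbitrary units).
gains = {"move_item_up": 0.5, "reorder_all": 0.9}
costs = {"move_item_up": 0.2, "reorder_all": 1.5}

# Only "move_item_up" has a positive net benefit (0.5 - 0.2).
chosen = choose_adaptation(gains, gains.get, costs.get)

# If every change costs more than it gains, the design is left unchanged.
high_costs = {"move_item_up": 2.0, "reorder_all": 2.0}
unchanged = choose_adaptation(gains, gains.get, high_costs.get)
```

Starting the comparison at a net benefit of zero means that "no change" is the default and an adaptation must beat it to be selected.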
Abstract:The paper presents a novel model-based method for intelligent tutoring, with particular emphasis on the problem of selecting teaching interventions in interaction with humans. Whereas previous work has focused on either personalization of teaching or optimization of teaching intervention sequences, the proposed individualized model-based planning approach represents convergence of these two lines of research. Model-based planning picks the best interventions via interactive learning of a user memory model's parameters. The approach is novel in its use of a cognitive model that can account for several key individual- and material-specific characteristics related to recall/forgetting, along with a planning technique that considers users' practice schedules. Taking a rule-based approach as a baseline, the authors evaluated the method's benefits in a controlled study of artificial teaching in second-language vocabulary learning (N=53).