Abstract: Behavioral game theorists all use experimental data to evaluate predictive models of human behavior. However, they differ greatly in their choice of loss function for these evaluations, with error rate, negative log-likelihood, cross-entropy, Brier score, and L2 error all being common choices. We attempt to offer a principled answer to the question of which loss functions make sense for this task, formalizing desiderata that we argue loss functions should satisfy. We construct a family of loss functions, which we dub "diagonal bounded Bregman divergences", that satisfies all of these axioms and includes the squared L2 error. In fact, the squared L2 error is the only acceptable loss that is in relatively common use; we thus recommend its continued use to behavioral game theorists.
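As a rough illustration of why the choice of loss matters, the sketch below compares two of the losses named in the abstract on a toy prediction. It is not the paper's construction: the function names, the three-action setup, and the numbers are all assumptions made for this example, and the definitions used are the standard textbook ones.

```python
import math

def squared_l2(pred, emp):
    """Squared L2 distance between a predicted and an empirical distribution."""
    return sum((p - q) ** 2 for p, q in zip(pred, emp))

def neg_log_likelihood(pred, counts):
    """Negative log-likelihood of observed action counts under a prediction."""
    total = 0.0
    for p, n in zip(pred, counts):
        if n == 0:
            continue  # unobserved actions contribute nothing
        if p == 0.0:
            return math.inf  # an observed action was predicted impossible
        total -= n * math.log(p)
    return total

# Hypothetical data: 10 subjects choosing among 3 actions.
counts = [6, 3, 1]
n = sum(counts)
empirical = [c / n for c in counts]

# Hypothetical model that puts zero mass on the rarely chosen third action.
pred = [0.5, 0.5, 0.0]

l2 = squared_l2(pred, empirical)
nll = neg_log_likelihood(pred, counts)
print(f"squared L2 error: {l2:.4f}")
print(f"negative log-likelihood: {nll}")
```

In this toy case the squared L2 error stays small and finite, while the negative log-likelihood is infinite because a single observation falls on an action the model assigned zero probability. Two losses can therefore rank the same model very differently.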
Abstract: Offline reinforcement learning, that is, learning a policy from a batch of data, is known to be hard: without making strong assumptions, it is easy to construct counterexamples such that existing algorithms fail. In this work, we instead consider a property of certain real-world problems where offline reinforcement learning should be effective: those where actions have only limited impact on a part of the state. We formalize and introduce this Action Impact Regularity (AIR) property. We further propose an algorithm that assumes and exploits the AIR property, and bound the suboptimality of the output policy when the MDP satisfies AIR. Finally, we demonstrate that our algorithm outperforms existing offline reinforcement learning algorithms across different data collection policies in two simulated environments where the regularity holds.
Abstract: Vibraimage is a digital system that quantifies a subject's mental and emotional state by analysing video footage of the movements of their head. Vibraimage is used by police, nuclear power station operators, airport security and psychiatrists in Russia, China, Japan and South Korea, and has been deployed at an Olympic Games, FIFA World Cup, and G7 Summit. Yet there is no reliable evidence that the technology is actually effective; indeed, many claims made about its effects seem unprovable. What exactly does vibraimage measure, and how has it acquired the power to penetrate the highest-profile and most sensitive security infrastructure across Russia and Asia? I first trace the development of the emotion recognition industry, before examining attempts by vibraimage's developers and affiliates to legitimate the technology scientifically, concluding that the disciplining power and corporate value of vibraimage are generated through its very opacity, in contrast to increasing demands across the social sciences for transparency. I propose the term 'suspect AI' to describe the growing number of systems like vibraimage that algorithmically classify suspects and non-suspects, yet are themselves deeply suspect. Popularising this term may help resist such technologies' reductivist approaches to 'reading' -- and exerting authority over -- emotion, intentionality and agency.