Abstract:As AI chatbots see increased adoption and integration into everyday life, questions have been raised about the potential impact of human-like or anthropomorphic AI on users. In this work, we investigate the extent to which interactions with ChatGPT (with a focus on Advanced Voice Mode) may impact users' emotional well-being, behaviors and experiences through two parallel studies. To study the affective use of AI chatbots, we perform large-scale automated analysis of ChatGPT platform usage in a privacy-preserving manner, analyzing over 3 million conversations for affective cues and surveying over 4,000 users on their perceptions of ChatGPT. To investigate whether there is a relationship between model usage and emotional well-being, we conduct an Institutional Review Board (IRB)-approved randomized controlled trial (RCT) on close to 1,000 participants over 28 days, examining changes in their emotional well-being as they interact with ChatGPT under different experimental settings. In both on-platform data analysis and the RCT, we observe that very high usage correlates with increased self-reported indicators of dependence. From our RCT, we find that the impact of voice-based interactions on emotional well-being to be highly nuanced, and influenced by factors such as the user's initial emotional state and total usage duration. Overall, our analysis reveals that a small number of users are responsible for a disproportionate share of the most affective cues.
Abstract:One application area of long-term memory (LTM) capabilities with increasing traction is personal AI companions and assistants. With the ability to retain and contextualize past interactions and adapt to user preferences, personal AI companions and assistants promise a profound shift in how we interact with AI and are on track to become indispensable in personal and professional settings. However, this advancement introduces new challenges and vulnerabilities that require careful consideration regarding the deployment and widespread use of these systems. The goal of this paper is to explore the broader implications of building and deploying personal AI applications with LTM capabilities using a holistic evaluation approach. This will be done in three ways: 1) reviewing the technological underpinnings of LTM in Large Language Models, 2) surveying current personal AI companions and assistants, and 3) analyzing critical considerations and implications of deploying and using these applications.
Abstract:This study investigates psychological factors influencing belief in AI predictions about personal behavior, comparing it to belief in astrology and personality-based predictions. Through an experiment with 238 participants, we examined how cognitive style, paranormal beliefs, AI attitudes, personality traits, and other factors affect perceived validity, reliability, usefulness, and personalization of predictions from different sources. Our findings reveal that belief in AI predictions is positively correlated with belief in predictions based on astrology and personality psychology. Notably, paranormal beliefs and positive AI attitudes significantly increased perceived validity, reliability, usefulness, and personalization of AI predictions. Conscientiousness was negatively correlated with belief in predictions across all sources, and interest in the prediction topic increased believability across predictions. Surprisingly, cognitive style did not significantly influence belief in predictions. These results highlight the "rational superstition" phenomenon in AI, where belief is driven more by mental heuristics and intuition than critical evaluation. We discuss implications for designing AI systems and communication strategies that foster appropriate trust and skepticism. This research contributes to our understanding of the psychology of human-AI interaction and offers insights for the design and deployment of AI systems.
Abstract:This study investigates the impact of model size on Online Continual Learning performance, with a focus on catastrophic forgetting. Employing ResNet architectures of varying sizes, the research examines how network depth and width affect model performance in class-incremental learning using the SplitCIFAR-10 dataset. Key findings reveal that larger models do not guarantee better Continual Learning performance; in fact, they often struggle more in adapting to new tasks, particularly in online settings. These results challenge the notion that larger models inherently mitigate catastrophic forgetting, highlighting the nuanced relationship between model size and Continual Learning efficacy. This study contributes to a deeper understanding of model scalability and its practical implications in Continual Learning scenarios.