Abstract:There is increased interest in using generative AI to create 3D spaces for Virtual Reality (VR) applications. However, today's models produce artificial environments, falling short of supporting collaborative tasks that benefit from incorporating the user's physical context. To generate environments that support VR telepresence, we introduce SpaceBlender, a novel pipeline that utilizes generative AI techniques to blend users' physical surroundings into unified virtual spaces. This pipeline transforms user-provided 2D images into context-rich 3D environments through an iterative process consisting of depth estimation, mesh alignment, and diffusion-based space completion guided by geometric priors and adaptive text prompts. In a preliminary within-subjects study, where 20 participants performed a collaborative VR affinity diagramming task in pairs, we compared SpaceBlender with a generic virtual environment and a state-of-the-art scene generation framework, evaluating its ability to create virtual spaces suitable for collaboration. Participants appreciated the enhanced familiarity and context provided by SpaceBlender but also noted complexities in the generative environments that could detract from task focus. Drawing on participant feedback, we propose directions for improving the pipeline and discuss the value and design of blended spaces for different scenarios.
Abstract:Today's video-conferencing tools support a rich range of professional and social activities, but their generic, grid-based environments cannot be easily adapted to meet the varying needs of distributed collaborators. To enable end-user customization, we developed BlendScape, a system for meeting participants to compose video-conferencing environments tailored to their collaboration context by leveraging AI image generation techniques. BlendScape supports flexible representations of task spaces by blending users' physical or virtual backgrounds into unified environments and implements multimodal interaction techniques to steer the generation. Through an evaluation with 15 end-users, we investigated their customization preferences for work and social scenarios. Participants could rapidly express their design intentions with BlendScape and envisioned using the system to structure collaboration in future meetings, but experienced challenges with preventing distracting elements. We implement scenarios to demonstrate BlendScape's expressiveness in supporting distributed collaboration techniques from prior work and propose composition techniques to improve the quality of environments.
Abstract:This paper contributes to a taxonomy of augmented reality and robotics based on a survey of 460 research papers. Augmented and mixed reality (AR/MR) have emerged as a new way to enhance human-robot interaction (HRI) and robotic interfaces (e.g., actuated and shape-changing interfaces). Recently, an increasing number of studies in HCI, HRI, and robotics have demonstrated how AR enables better interactions between people and robots. However, often research remains focused on individual explorations and key design strategies, and research questions are rarely analyzed systematically. In this paper, we synthesize and categorize this research field in the following dimensions: 1) approaches to augmenting reality; 2) characteristics of robots; 3) purposes and benefits; 4) classification of presented information; 5) design components and strategies for visual augmentation; 6) interaction techniques and modalities; 7) application domains; and 8) evaluation strategies. We formulate key challenges and opportunities to guide and inform future research in AR and robotics.
Abstract:We introduce Deep Thermal Imaging, a new approach for close-range automatic recognition of materials to enhance the understanding of people and ubiquitous technologies of their proximal environment. Our approach uses a low-cost mobile thermal camera integrated into a smartphone to capture thermal textures. A deep neural network classifies these textures into material types. This approach works effectively without the need for ambient light sources or direct contact with materials. Furthermore, the use of a deep learning network removes the need to handcraft the set of features for different materials. We evaluated the performance of the system by training it to recognise 32 material types in both indoor and outdoor environments. Our approach produced recognition accuracies above 98% in 14,860 images of 15 indoor materials and above 89% in 26,584 images of 17 outdoor materials. We conclude by discussing its potentials for real-time use in HCI applications and future directions.
Abstract:The ability to monitor respiratory rate is extremely important for medical treatment, healthcare and fitness sectors. In many situations, mobile methods, which allow users to undertake every day activities, are required. However, current monitoring systems can be obtrusive, requiring users to wear respiration belts or nasal probes. Recent advances in thermographic systems have shrunk their size, weight and cost, to the point where it is possible to create smart-phone based respiration rate monitoring devices that are not affected by lighting conditions. However, mobile thermal imaging is challenged in scenes with high thermal dynamic ranges. This challenge is further amplified by general problems such as motion artifacts and low spatial resolution, leading to unreliable breathing signals. In this paper, we propose a novel and robust approach for respiration tracking which compensates for the negative effects of variations in the ambient temperature and motion artifacts and can accurately extract breathing rates in highly dynamic thermal scenes. It has three main contributions. The first is a novel Optimal Quantization technique which adaptively constructs a color mapping of absolute temperature to improve segmentation, classification and tracking. The second is the Thermal Gradient Flow method that computes thermal gradient magnitude maps to enhance accuracy of the nostril region tracking. Finally, we introduce the Thermal Voxel method to increase the reliability of the captured respiration signals compared to the traditional averaging method. We demonstrate the extreme robustness of our system to track the nostril-region and measure the respiratory rate in high dynamic range scenes.