Abstract:We present two studies exploring how blind or low-vision (BLV) family members engage with their sighted children's artwork, strategies to support understanding and interpretation, and the potential role of technology, such as AI, therein. Our first study involved 14 BLV individuals, and the second included five groups of BLV individuals with their children. Through semi-structured interviews with AI descriptions of children's artwork and multi-sensory design probes, we found that BLV family members value artwork engagement as a bonding opportunity, preferring the child's storytelling and interpretation over other nonvisual representations. Additionally, despite some inaccuracies, BLV family members felt that AI-generated descriptions could facilitate dialogue with their children and aid self-guided art discovery. We close with specific design considerations for supporting artwork engagement in mixed visual-ability families, including enabling artwork access through various methods, supporting children's corrections of AI output, and distinctions in context vs. content and interpretation vs. description of children's artwork.
Abstract:Crowdsourcing platforms have transformed distributed problem-solving, yet quality control remains a persistent challenge. Traditional quality control measures, such as prescreening workers and refining instructions, often focus solely on optimizing economic output. This paper explores just-in-time AI interventions to enhance both labeling quality and domain-specific knowledge among crowdworkers. We introduce LabelAId, an advanced inference model combining Programmatic Weak Supervision (PWS) with FT-Transformers to infer label correctness based on user behavior and domain knowledge. Our technical evaluation shows that our LabelAId pipeline consistently outperforms state-of-the-art ML baselines, improving mistake inference accuracy by 36.7% with 50 downstream samples. We then implemented LabelAId into Project Sidewalk, an open-source crowdsourcing platform for urban accessibility. A between-subjects study with 34 participants demonstrates that LabelAId significantly enhances label precision without compromising efficiency while also increasing labeler confidence. We discuss LabelAId's success factors, limitations, and its generalizability to other crowdsourced science domains.
Abstract:We demonstrate the use of DeCAF and Improved Fisher Vector image features to classify clothing texture. The issue of choosing clothes is a problem for the blind every day. This work attempts to solve the issue with a finger-mounted camera and state-of-the-art classification algorithms. To evaluate our solution, we collected 520 close-up images across 29 pieces of clothing. We contribute (1) the HCTD, an image dataset taken with a NanEyeGS camera, a camera small enough to be mounted on the finger, and (2) evaluations of state-of-the-art recognition algorithms applied to our dataset - achieving an accuracy >95%. Throughout the paper, we will discuss previous work, evaluate the current work, and finally, suggest the project's future direction.
Abstract:To help improve the safety and accessibility of indoor spaces, researchers and health professionals have created assessment instruments that enable homeowners and trained experts to audit and improve homes. With advances in computer vision, augmented reality (AR), and mobile sensors, new approaches are now possible. We introduce RASSAR (Room Accessibility and Safety Scanning in Augmented Reality), a new proof-of-concept prototype for semi-automatically identifying, categorizing, and localizing indoor accessibility and safety issues using LiDAR + camera data, machine learning, and AR. We present an overview of the current RASSAR prototype and a preliminary evaluation in a single home.
Abstract:There is a lack of data on the location, condition, and accessibility of sidewalks across the world, which not only impacts where and how people travel but also fundamentally limits interactive mapping tools and urban analytics. In this paper, we describe initial work in semi-automatically building a sidewalk network topology from satellite imagery using hierarchical multi-scale attention models, inferring surface materials from street-level images using active learning-based semantic segmentation, and assessing sidewalk condition and accessibility features using Crowd+AI. We close with a call to create a database of labeled satellite and streetscape scenes for sidewalks and sidewalk accessibility issues along with standardized benchmarks.
Abstract:Recent advances have enabled automatic sound recognition systems for deaf and hard of hearing (DHH) users on mobile devices. However, these tools use pre-trained, generic sound recognition models, which do not meet the diverse needs of DHH users. We introduce ProtoSound, an interactive system for customizing sound recognition models by recording a few examples, thereby enabling personalized and fine-grained categories. ProtoSound is motivated by prior work examining sound awareness needs of DHH people and by a survey we conducted with 472 DHH participants. To evaluate ProtoSound, we characterized performance on two real-world sound datasets, showing significant improvement over state-of-the-art (e.g., +9.7% accuracy on the first dataset). We then deployed ProtoSound's end-user training and real-time recognition through a mobile application and recruited 19 hearing participants who listened to the real-world sounds and rated the accuracy across 56 locations (e.g., homes, restaurants, parks). Results show that ProtoSound personalized the model on-device in real-time and accurately learned sounds across diverse acoustic contexts. We close by discussing open challenges in personalizable sound recognition, including the need for better recording interfaces and algorithmic improvements.