Abstract:Human-machine teaming in medical AI requires us to understand to what degree a trained clinician should weigh AI predictions. While previous work has shown the potential of AI assistance at improving clinical predictions, existing clinical decision support systems either provide no explainability of their predictions or use techniques like saliency and Shapley values, which do not allow for physician-based verification. To address this gap, this study compares previously used explainable AI techniques with a newly proposed technique termed '2-factor retrieval (2FR)', which is a combination of interface design and search retrieval that returns similarly labeled data without processing this data. This results in a 2-factor security blanket where: (a) correct images need to be retrieved by the AI; and (b) humans should associate the retrieved images with the current pathology under test. We find that when tested on chest X-ray diagnoses, 2FR leads to increases in clinician accuracy, with particular improvements when clinicians are radiologists and have low confidence in their decision. Our results highlight the importance of understanding how different modes of human-AI decision making may impact clinician accuracy in clinical decision support systems.
Abstract:Polysomnography (PSG), the current gold standard method for monitoring and detecting sleep disorders, is cumbersome and costly. At-home testing solutions, known as home sleep apnea testing (HSAT), exist. However, they are contact-based, a feature which limits the ability of some patient populations to tolerate testing and discourages widespread deployment. Previous work on non-contact sleep monitoring for sleep apnea detection either estimates respiratory effort using radar or nasal airflow using a thermal camera, but has not compared the two or used them together. We conducted a study on 10 participants, ages 34 - 78, with suspected sleep disorders using a hardware setup with a synchronized radar and thermal camera. We show the first comparison of radar and thermal imaging for sleep monitoring, and find that our thermal imaging method outperforms radar significantly. Our thermal imaging method detects apneas with an accuracy of 0.99, a precision of 0.68, a recall of 0.74, an F1 score of 0.71, and an intra-class correlation of 0.70; our radar method detects apneas with an accuracy of 0.83, a precision of 0.13, a recall of 0.86, an F1 score of 0.22, and an intra-class correlation of 0.13. We also present a novel proposal for classifying obstructive and central sleep apnea by leveraging a multimodal setup. This method could be used accurately detect and classify apneas during sleep with non-contact sensors, thereby improving diagnostic capacities in patient populations unable to tolerate current technology.
Abstract:The exponential progress in generative AI poses serious implications for the credibility of all real images and videos. There will exist a point in the future where 1) digital content produced by generative AI will be indistinguishable from those created by cameras, 2) high-quality generative algorithms will be accessible to anyone, and 3) the ratio of all synthetic to real images will be large. It is imperative to establish methods that can separate real data from synthetic data with high confidence. We define real images as those that were produced by the camera hardware, capturing a real-world scene. Any synthetic generation of an image or alteration of a real image through generative AI or computer graphics techniques is labeled as a synthetic image. To this end, this document aims to: present known strategies in detection and cryptography that can be employed to verify which images are real, weight the strengths and weaknesses of these strategies, and suggest additional improvements to alleviate shortcomings.
Abstract:With the onset of diffusion-based generative models and their ability to generate text-conditioned images, content generation has received a massive invigoration. Recently, these models have been shown to provide useful guidance for the generation of 3D graphics assets. However, existing work in text-conditioned 3D generation faces fundamental constraints: (i) inability to generate detailed, multi-object scenes, (ii) inability to textually control multi-object configurations, and (iii) physically realistic scene composition. In this work, we propose CG3D, a method for compositionally generating scalable 3D assets that resolves these constraints. We find that explicit Gaussian radiance fields, parameterized to allow for compositions of objects, possess the capability to enable semantically and physically consistent scenes. By utilizing a guidance framework built around this explicit representation, we show state of the art results, capable of even exceeding the guiding diffusion model in terms of object combinations and physics accuracy.
Abstract:Thermal cameras and thermal point detectors are used to measure the temperature of human skin. These are important devices that are used everyday in clinical and mass screening settings, particularly in an epidemic. Unfortunately, despite the wide use of thermal sensors, the temperature estimates from thermal sensors do not work well in uncontrolled scene conditions. Previous work has studied the effect of wind and other environment factors on skin temperature, but has not considered the heating effect from sunlight, which is termed solar loading. Existing device manufacturers recommend that a subject who has been outdoors in sun re-acclimate to an indoor environment after a waiting period. The waiting period, up to 30 minutes, is insufficient for a rapid screening tool. Moreover, the error bias from solar loading is greater for darker skin tones since melanin absorbs solar radiation. This paper explores two approaches to address this problem. The first approach uses transient behavior of cooling to more quickly extrapolate the steady state temperature. A second approach explores the spatial modulation of solar loading, to propose single-shot correction with a wide-field thermal camera. A real world dataset comprising of thermal point, thermal image, subjective, and objective measurements of melanin is collected with statistical significance for the effect size observed. The single-shot correction scheme is shown to eliminate solar loading bias in the time of a typical frame exposure (33ms).