Abstract:Background: Racial and ethnic minority groups and individuals facing social disadvantages, which often stem from their social determinants of health (SDoH), bear a disproportionate burden of type 2 diabetes (T2D) and its complications. It is therefore crucial to implement effective social risk management strategies at the point of care. Objective: To develop an EHR-based machine learning (ML) analytical pipeline to identify the unmet social needs associated with hospitalization risk in patients with T2D. Methods: We identified 10,192 T2D patients from the EHR data (from 2012 to 2022) from the University of Florida Health Integrated Data Repository, including contextual SDoH (e.g., neighborhood deprivation) and individual-level SDoH (e.g., housing stability). We developed an electronic health records (EHR)-based machine learning (ML) analytic pipeline, namely individualized polysocial risk score (iPsRS), to identify high social risk associated with hospitalizations in T2D patients, along with explainable AI (XAI) techniques and fairness assessment and optimization. Results: Our iPsRS achieved a C statistic of 0.72 in predicting 1-year hospitalization after fairness optimization across racial-ethnic groups. The iPsRS showed excellent utility for capturing individuals at high hospitalization risk; the actual 1-year hospitalization rate in the top 5% of iPsRS was ~13 times as high as the bottom decile. Conclusion: Our ML pipeline iPsRS can fairly and accurately screen for patients who have increased social risk leading to hospitalization in T2D patients.
Abstract:Combining high-level and low-level visual tasks is a common technique in the field of computer vision. This work integrates the technique of image super resolution to semantic segmentation for document image binarization. It demonstrates that using image super-resolution as a preprocessing step can effectively enhance the results and performance of semantic segmentation.
Abstract:To efficiently extract the textual information from color degraded document images is an important research topic. Long-term imperfect preservation of ancient documents has led to various types of degradation such as page staining, paper yellowing, and ink bleeding; these degradations badly impact the image processing for information extraction. In this paper, we present CCDWT-GAN, a generative adversarial network (GAN) that utilizes the discrete wavelet transform (DWT) on RGB (red, green, blue) channel splited images. The proposed method comprises three stages: image preprocessing, image enhancement, and image binarization. This work conducts comparative experiments in the image preprocessing stage to determine the optimal selection of DWT with normalization. Additionally, we perform an ablation study on the results of the image enhancement stage and the image binarization stage to validate their positive effect on the model performance. This work compares the performance of the proposed method with other state-of-the-art (SOTA) methods on DIBCO and H-DIBCO ((Handwritten) Document Image Binarization Competition) datasets. The experimental results demonstrate that CCDWT-GAN achieves a top two performance on multiple benchmark datasets, and outperforms other SOTA methods.
Abstract:Object detection and single image super-resolution are classic problems in computer vision (CV). The object detection task aims to recognize the objects in input images, while the image restoration task aims to reconstruct high quality images from given low quality images. In this paper, a two-stage framework for object detection and image restoration is proposed. The first stage uses YOLO series algorithms to complete the object detection and then performs image cropping. In the second stage, this work improves Swin Transformer and uses the new proposed algorithm to connect the Swin Transformer layer to design a new neural network architecture. We name the newly proposed network for image restoration SwinOIR. This work compares the model performance of different versions of YOLO detection algorithms on MS COCO dataset and Pascal VOC dataset, demonstrating the suitability of different YOLO network models for the first stage of the framework in different scenarios. For image super-resolution task, it compares the model performance of using different methods of connecting Swin Transformer layers and design different sizes of SwinOIR for use in different life scenarios. Our implementation code is released at https://github.com/Rubbbbbbbbby/SwinOIR.
Abstract:Previous gait phase detection as convolutional neural network (CNN) based classification task requires cumbersome manual setting of time delay or heavy overlapped sliding windows to accurately classify each phase under different test cases, which is not suitable for streaming Inertial-Measurement-Unit (IMU) sensor data and fails to adapt to different scenarios. This paper presents a segmentation based gait phase detection with only a single six-axis IMU sensor, which can easily adapt to both walking and running at various speeds. The proposed segmentation uses CNN with gait phase aware receptive field setting and IMU oriented processing order, which can fit to high sampling rate of IMU up to 1000Hz for high accuracy and low sampling rate down to 20Hz for real time calculation. The proposed model on the 20Hz sampling rate data can achieve average error of 8.86 ms in swing time, 9.12 ms in stance time and 96.44\% accuracy of gait phase detection and 99.97\% accuracy of stride detection. Its real-time implementation on mobile phone only takes 36 ms for 1 second length of sensor data.