Abstract: Knowledge distillation (KD) enhances the performance of a student network by allowing it to learn incrementally from the knowledge transferred by a teacher network. Existing methods dynamically adjust the temperature so that the student network can adapt to the varying learning difficulties at different stages of KD. KD is a continuous process, but when adjusting the temperature, these methods consider only the immediate benefit of the operation in the current learning phase and ignore its future returns. To address this issue, we formulate temperature adjustment as a sequential decision-making task and propose a method based on reinforcement learning, termed RLKD. Importantly, we design a novel state representation that enables the agent to make more informed actions (i.e., instance temperature adjustments). To handle the delayed rewards that arise in our method from the KD setting, we explore an instance reward calibration approach. In addition, we devise an efficient exploration strategy that enables the agent to learn a valuable instance temperature adjustment policy more efficiently. Our framework can serve as a plug-and-play technique that is easily inserted into various KD methods, and we validate its effectiveness on both image classification and object detection tasks. Our code is at https://github.com/Zhengbo-Zhang/ITKD
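To make the role of the temperature concrete, the following is a minimal sketch of the standard temperature-scaled distillation loss (softened softmax plus KL divergence, scaled by T^2), evaluated with a different temperature per instance. It is not the RLKD agent itself: the function names, the logits, and the per-instance temperatures are hypothetical values chosen for illustration, and in RLKD the temperatures would be chosen by the learned policy.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Softmax of logits / temperature, computed in a numerically stable way."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kd_loss(student_logits, teacher_logits, temperature):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax_with_temperature(teacher_logits, temperature)
    q = softmax_with_temperature(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


# Hypothetical batch: (student logits, teacher logits, instance temperature).
batch = [
    ([2.0, 1.0, 0.1], [3.0, 0.5, 0.2], 1.0),
    ([0.5, 2.5, 0.3], [0.2, 4.0, 0.1], 4.0),
]
losses = [kd_loss(s, t, temp) for s, t, temp in batch]
print(losses)
```

A higher temperature flattens both distributions, so the student sees more of the teacher's relative preferences among non-target classes; a lower temperature sharpens them toward the top class.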
Abstract: Chronic Obstructive Pulmonary Disease (COPD) is a chronic inflammatory lung condition that causes airflow obstruction. Existing methods can only detect patients who already have COPD, based on obvious features in the spirogram (in this article, the spirogram specifically refers to the Volume-Flow curve time series). Early prediction of COPD risk is vital for monitoring disease progression, slowing it down, or even preventing its onset. However, these methods cannot predict an individual's future probability of COPD from subtle features in the spirogram. To address this gap, we propose DeepSpiro, the first deep learning method for early prediction of future COPD risk. DeepSpiro consists of four parts. First, we construct Volume-Flow curves guided by Time-Volume instability smoothing (SpiroSmoother) to precisely enhance the stability of the original Volume-Flow curves. Second, we extract critical features from the evolution of varied-length key patches (SpiroEncoder), mapping the original high-dimensional dynamic sequences to a unified low-dimensional temporal representation that captures the key temporal evolution. Third, we explain the model based on temporal attention and heterogeneous feature fusion (SpiroExplainer), which integrates information from heterogeneous data such as the spirogram and demographic information. Fourth, we predict the risk of COPD based on the evolution of key patch concavity (SpiroPredictor), enabling accurate prediction of disease risk in high-risk patients who are not yet diagnosed, over horizons of 1, 2, 3, 4, and 5 years and beyond. We conduct experiments on the UK Biobank dataset. Results show that DeepSpiro achieves an AUC of 0.8328 on the COPD detection task. In the early prediction tasks, the high-risk and low-risk groups show significantly different future outcomes, with a p-value of <0.001.
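Since the detection result above is reported as an AUC, a short sketch of how that metric is computed may help. The code below uses the pairwise (Mann-Whitney) definition: the AUC is the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name, labels, and scores are hypothetical toy values, not DeepSpiro outputs.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = COPD, 0 = no COPD.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)
print(auc)  # 0.75: three of the four positive/negative pairs are ordered correctly
```

The quadratic pair loop is fine for illustration; for a cohort the size of UK Biobank a rank-based implementation would be the practical choice.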
Abstract: Atrial Fibrillation (AF) is a common cardiac arrhythmia. Many AF patients experience complications such as stroke and other cardiovascular issues, so early detection of AF is crucial. Existing algorithms can only distinguish "AF rhythm in AF patients" from "sinus rhythm in normal individuals". However, AF patients do not always exhibit AF rhythm, which makes diagnosis challenging when the AF rhythm is absent. To address this, this paper proposes a novel artificial intelligence (AI) algorithm that distinguishes "sinus rhythm in AF patients" from "sinus rhythm in normal individuals" at the beat level. We introduce beat-level risk interpreters and trend risk interpreters, which address the interpretability issues of deep learning models and the difficulty of explaining AF risk trends. Additionally, a beat-level information fusion decision is presented to enhance model accuracy. The experimental results show that the average AUC with single beats from the CPSC 2021 dataset as testing data is 0.7314. With 150 beats used in the information fusion decision algorithm, the average AUC reaches 0.7591. Compared with previous segment-level algorithms, we use beats as input, which reduces data dimensionality and makes the model more lightweight, facilitating deployment on portable medical devices. Furthermore, we report new findings from average beat analysis and subgroup analysis across varying risk levels.
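The abstract above does not state which rule the information fusion decision uses. As a sketch under that caveat, the code below assumes the simplest possible rule, averaging the per-beat risk scores over the first N beats of a recording. The function name and the scores are hypothetical.

```python
def fuse_beat_scores(beat_scores, n_beats):
    """Average the risk scores of the first n_beats beats (assumed fusion rule)."""
    if n_beats < 1:
        raise ValueError("n_beats must be at least 1")
    window = beat_scores[:n_beats]
    if not window:
        raise ValueError("beat_scores is empty")
    return sum(window) / len(window)


# Hypothetical per-beat risk scores for one recording.
beat_scores = [0.62, 0.48, 0.71, 0.55, 0.66]

single_beat = fuse_beat_scores(beat_scores, 1)  # decision from one beat
fused = fuse_beat_scores(beat_scores, 5)        # decision from five beats
print(single_beat, fused)
```

Averaging reduces the influence of any single noisy beat, which is one plausible reason a fused decision over many beats can score higher than a single-beat decision.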
Abstract: In this work, we leverage the success of denoising diffusion models and formulate feature refinement as an autoencoder-formed diffusion process. The state-of-the-art continuous sign language recognition (CSLR) framework consists of a spatial module, a visual module, a sequence module, and a sequence learning function. However, this framework suffers from overfitting of the sequence module, caused by the objective function and the small scale of available benchmarks, which results in insufficient model training. To overcome the overfitting problem, some CSLR studies force the sequence module to learn more visual temporal information or guide it with more informative supervision to refine its representations. In this work, we propose a novel autoencoder-formed conditional diffusion feature refinement (ACDR) that refines the sequence representations so that they acquire desired properties, by learning the encoding-decoding optimization process in an end-to-end way. Specifically, ACDR uses a noising Encoder that progressively adds noise equipped with semantic conditions to the sequence representations, and a denoising Decoder that progressively denoises the noisy sequence representations with semantic conditions. The sequence representations are thereby imbued with the semantics of the provided conditions. Further, a semantic constraint is employed to prevent the denoised sequence representations from semantic corruption. Extensive experiments validate the effectiveness of ACDR, which benefits state-of-the-art methods and achieves a notable gain on three benchmarks.
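For readers unfamiliar with the noising step that diffusion models rely on, the following sketch shows the standard closed-form forward process, x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps, applied to a small feature vector. This is the generic formulation only: ACDR's semantic conditioning and its learned Encoder and Decoder are not modeled here, and the linear schedule, its endpoints, the step count, and the feature values are all assumptions made for illustration.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed, commonly used schedule)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """abar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def add_noise(x0, t, alpha_bars, rng):
    """Sample x_t directly from x_0 and return it with the noise that was used."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps


rng = random.Random(0)            # fixed seed so the sketch is repeatable
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
x0 = [0.5, -1.2, 0.3, 0.9]        # hypothetical sequence feature vector
xt, eps = add_noise(x0, 500, alpha_bars, rng)
print(alpha_bars[0], alpha_bars[-1])
```

Because abar_t shrinks toward zero as t grows, late steps are almost pure noise; a denoising network is then trained to undo these steps.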
Abstract: Sepsis is a leading cause of death in the ICU. It requires complex interventions within a short period of time, but its optimal treatment strategy remains uncertain. Evidence suggests that current treatment practices are problematic and may cause harm to patients. To address this decision problem, we propose a new medical decision model based on historical data to help clinicians recommend the best reference option for real-time treatment. Our model combines offline reinforcement learning with deep reinforcement learning to address the problem that traditional reinforcement learning in healthcare cannot interact with the environment, enabling our model to make decisions in a continuous state-action space. We demonstrate that, on average, the treatments recommended by the model are more valuable and reliable than those recommended by clinicians. In a large validation dataset, we found that patients whose actual doses from clinicians matched the AI's decisions had the lowest mortality rates. Our model provides personalized, clinically interpretable treatment decisions for sepsis that can improve patient care.
Abstract: Infrared small target detection plays an important role in infrared search and tracking applications. In recent years, deep learning techniques have been introduced to this task and have achieved noteworthy results. Following general object segmentation methods, existing deep learning methods usually process the image from a global view. However, these methods do not adequately consider the imaging locality of small targets or the extreme class imbalance between target and background pixels, which leads to inefficient training and a heavy dependence on large amounts of data. A multi-patch attention network (MANet) is proposed in this paper to detect small targets by jointly considering the global and local properties of infrared small target images. From the global view, a supervised attention module trained with the small target spread map is proposed to suppress most background pixels irrelevant to small target features. From the local view, local patches are split from the global features and share the same convolution weights with each other in a patch net. By synthesizing the global and local properties, the data-driven framework proposed in this paper fuses multi-scale features for small target detection. Extensive experiments on synthetic and real data show that the proposed method achieves state-of-the-art performance compared with both existing conventional and deep learning methods.
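The local-view idea above, splitting global features into patches that all pass through the same weights, can be sketched in a few lines. This is a deliberately simplified stand-in for the patch net: a single weight grid is multiplied elementwise with each non-overlapping patch and summed, rather than a learned convolution stack. The function names, feature map, and weights are hypothetical.

```python
def split_into_patches(feature_map, patch_h, patch_w):
    """Split a 2D feature map into non-overlapping patches, in row-major order."""
    rows, cols = len(feature_map), len(feature_map[0])
    if rows % patch_h or cols % patch_w:
        raise ValueError("feature map size must be divisible by the patch size")
    patches = []
    for r in range(0, rows, patch_h):
        for c in range(0, cols, patch_w):
            patches.append([row[c:c + patch_w] for row in feature_map[r:r + patch_h]])
    return patches


def shared_patch_response(patch, weights):
    """Apply one shared weight grid to a patch (elementwise product, then sum)."""
    return sum(
        p * w
        for patch_row, weight_row in zip(patch, weights)
        for p, w in zip(patch_row, weight_row)
    )


# Hypothetical 4x4 feature map with one bright target and one faint target.
feature_map = [
    [0, 0, 0, 0],
    [0, 9, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 1],
]
weights = [[1, 1], [1, 1]]  # the same weights are reused for every patch

patches = split_into_patches(feature_map, 2, 2)
responses = [shared_patch_response(p, weights) for p in patches]
print(responses)  # [9, 0, 0, 1]
```

Reusing one set of weights across patches means every patch contributes training signal to the same parameters, which is one way to cope with targets that occupy only a few pixels of the full image.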
Abstract: Objective: To conduct a systematic review of deep learning methods on Electrocardiogram (ECG) data from the perspective of model architecture and application task. Methods: First, we extensively searched Google Scholar, PubMed and DBLP for papers deploying deep learning (deep neural networks) on ECG data that were published between January 1st 2010 and September 30th 2019. Then we analyzed them in three aspects: task, model and data. Finally, we summarized unresolved challenges and problems that existing models cannot handle well. Results: The total number of papers is 124, of which 97 were published in the most recent two years. Almost all common deep learning architectures have been used in ECG analytics tasks such as disease detection/classification, annotation/localization, sleep staging, biometric human identification, and denoising. Conclusion: The number of works on deep learning for ECG data has grown explosively in recent years, and these works have achieved far better performance in terms of accuracy. However, new challenges and problems such as interpretability, scalability, and efficiency need to be addressed and deserve more attention. Moreover, it is also worth exploring new and interesting applications from both the dataset and the method perspectives. Significance: This paper summarizes existing deep learning methods for modeling ECG data from multiple views and points out existing challenges and problems, which can become potential research directions in the future.