Abstract:Knowledge distillation is a technique to imitate a performance that a deep learning model has, but reduce the size on another model. It applies the outputs of a model to train another model having comparable accuracy. These two distinct models are similar to the way information is delivered in human society, with one acting as the "teacher" and the other as the "student". Softmax plays a role in comparing logits generated by models with each other by converting probability distributions. It delivers the logits of a teacher to a student with compression through a parameter named temperature. Tuning this variable reinforces the distillation performance. Although only this parameter helps with the interaction of logits, it is not clear how temperatures promote information transfer. In this paper, we propose a novel approach to calculate the temperature. Our method only refers to the maximum logit generated by a teacher model, which reduces computational time against state-of-the-art methods. Our method shows a promising result in different student and teacher models on a standard benchmark dataset. Algorithms using temperature can obtain the improvement by plugging in this dynamic approach. Furthermore, the approximation of the distillation process converges to a correlation of logits by both models. This reinforces the previous argument that the distillation conveys the relevance of logits. We report that this approximating algorithm yields a higher temperature compared to the commonly used static values in testing.
Abstract:Eosinophilic esophagitis (EoE) is a chronic, food antigen-driven, allergic inflammatory condition of the esophagus associated with elevated esophageal eosinophils. EoE is a top cause of chronic dysphagia after GERD. Diagnosis of EoE relies on counting eosinophils in histological slides, a manual and time-consuming task that limits the ability to extract complex patient-dependent features. The treatment of EoE includes medication and food elimination. A personalized food elimination plan is crucial for engagement and efficiency, but previous attempts failed to produce significant results. In this work, on the one hand, we utilize AI for inferring histological features from the entire biopsy slide, features that cannot be extracted manually. On the other hand, we develop causal learning models that can process this wealth of data. We applied our approach to the 'Six-Food vs. One-Food Eosinophilic Esophagitis Diet Study', where 112 symptomatic adults aged 18-60 years with active EoE were assigned to either a six-food elimination diet (6FED) or a one-food elimination diet (1FED) for six weeks. Our results show that the average treatment effect (ATE) of the 6FED treatment compared with the 1FED treatment is not significant, that is, neither diet was superior to the other. We examined several causal models and show that the best treatment strategy was obtained using T-learner with two XGBoost modules. While 1FED only and 6FED only provide improvement for 35%-38% of the patients, which is not significantly different from a random treatment assignment, our causal model yields a significantly better improvement rate of 58.4%. This study illustrates the significance of AI in enhancing treatment planning by analyzing molecular features' distribution in histological slides through causal learning. Our approach can be harnessed for other conditions that rely on histology for diagnosis and treatment.