Abstract: Tensor program optimization on Deep Learning Accelerators (DLAs) is critical for efficient model deployment. Although search-based Deep Learning Compilers (DLCs) have achieved significant performance gains over manual methods, they still suffer from low search efficiency and poor cross-platform adaptability. In this paper, we propose $\textbf{Pruner}$, which follows hardware/software co-design principles to hierarchically boost tensor program optimization. Pruner comprises two primary components: a Parameterized Static Analyzer ($\textbf{PSA}$) and a Pattern-aware Cost Model ($\textbf{PaCM}$). The former is a hardware-aware, formulaic performance analysis tool that guides the pruning of the search space, while the latter predicts the performance of tensor programs from their critical data-flow patterns. Furthermore, to ensure effective cross-platform adaptation, we design a Momentum Transfer Learning ($\textbf{MTL}$) strategy using a Siamese network, which establishes a bidirectional feedback mechanism to improve the robustness of the pre-trained cost model. Extensive experimental results demonstrate the effectiveness of Pruner on various tensor program tuning tasks in both online and offline scenarios, with low resource overhead. The code is available at https://github.com/qiaolian9/Pruner.
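As a rough illustration of what a momentum-style update between the two branches of a Siamese network can look like, the sketch below applies an exponential moving average (EMA) to a set of parameters. This is an assumption made for illustration only: the abstract does not specify the update rule, and the function name `momentum_update`, the coefficient `m`, and the dictionary-of-lists parameter layout are all hypothetical rather than taken from the Pruner code base.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def momentum_update(target: Params, online: Params, m: float = 0.99) -> Params:
    """Return new target parameters moved toward the online ones by one EMA step.

    Assumed rule (not confirmed by the abstract):
        target_new = m * target + (1 - m) * online
    """
    if not 0.0 <= m <= 1.0:
        raise ValueError("momentum coefficient m must lie in [0, 1]")
    if target.keys() != online.keys():
        raise ValueError("both branches must share the same parameter names")

    updated: Params = {}
    for name, target_values in target.items():
        online_values = online[name]
        if len(target_values) != len(online_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise exponential moving average of the two branches.
        updated[name] = [
            m * t + (1.0 - m) * o for t, o in zip(target_values, online_values)
        ]
    return updated


if __name__ == "__main__":
    # Hypothetical two-element weight vector, used only to show the call.
    pretrained = {"w": [1.0, 2.0]}
    finetuned = {"w": [0.0, 4.0]}
    print(momentum_update(pretrained, finetuned, m=0.5))
```

With `m` close to 1 the target branch changes slowly, which is the usual motivation for a momentum update when one wants a pre-trained model to stay stable while a second branch adapts to new data.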
Abstract: The ongoing global pandemic of Coronavirus Disease 2019 (COVID-19) has posed a serious threat to public health and the economy. Rapid and accurate diagnosis of COVID-19 is crucial to prevent the further spread of the disease and reduce its mortality. Chest computed tomography (CT) is an effective tool for the early diagnosis of lung diseases, including pneumonia. However, detecting COVID-19 from CT is demanding and prone to human error, as some early-stage patients may have negative findings on images. In this study, we propose a novel residual network that automatically distinguishes COVID-19 from other common pneumonia and normal cases using CT images. Specifically, we employ a modified 3D ResNet18 as the backbone network and equip it with both channel-wise attention (CA) and depth-wise attention (DA) modules to further improve diagnostic performance. Experimental results on a large open-source dataset show that our method can differentiate COVID-19 from the other two classes with 94.7% accuracy, 93.73% sensitivity, 98.28% specificity, 95.26% F1-score, and an area under the receiver operating characteristic curve (AUC) of 0.99, outperforming baseline methods. These results demonstrate that the proposed method could potentially assist clinicians in performing a quick diagnosis to fight COVID-19.
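For readers less familiar with the reported metrics, the following minimal sketch shows how sensitivity, specificity, and F1-score for one class against the rest can be derived from a multi-class confusion matrix. The function name `one_vs_rest_metrics` and the counts in the usage example are hypothetical and chosen only for illustration; they are not the paper's data, and the abstract does not state exactly how its accuracy figure was aggregated.

```python
from typing import Dict, List


def _ratio(numerator: float, denominator: float) -> float:
    """Divide, returning 0.0 for an empty denominator (a convention assumed here)."""
    return numerator / denominator if denominator else 0.0


def one_vs_rest_metrics(confusion: List[List[int]], positive: int) -> Dict[str, float]:
    """Compute one-vs-rest metrics for class index `positive`.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)

    tp = confusion[positive][positive]
    fn = sum(confusion[positive]) - tp                       # positives missed
    fp = sum(confusion[i][positive] for i in range(n)) - tp  # false alarms
    tn = total - tp - fn - fp

    sensitivity = _ratio(tp, tp + fn)  # also called recall
    specificity = _ratio(tn, tn + fp)
    precision = _ratio(tp, tp + fp)
    f1 = _ratio(2 * precision * sensitivity, precision + sensitivity)
    # Overall multi-class accuracy: correctly classified samples over all samples.
    accuracy = _ratio(sum(confusion[i][i] for i in range(n)), total)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


if __name__ == "__main__":
    # Made-up counts; rows are true classes, columns are predicted classes.
    # Class 0 stands in for the positive class in this illustration.
    example = [
        [90, 5, 5],
        [4, 80, 16],
        [6, 10, 84],
    ]
    for name, value in one_vs_rest_metrics(example, positive=0).items():
        print(f"{name}: {value:.4f}")
```

Sensitivity and specificity are computed one-vs-rest for the chosen class, while the accuracy in this sketch is the overall multi-class accuracy; a one-vs-rest accuracy would instead be `(tp + tn) / total`.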