Abstract:High-level synthesis (HLS) has significantly advanced the automation of digital circuits design, yet the need for expertise and time in pragma tuning remains challenging. Existing solutions for the design space exploration (DSE) adopt either heuristic methods, lacking essential information for further optimization potential, or predictive models, missing sufficient generalization due to the time-consuming nature of HLS and the exponential growth of the design space. To address these challenges, we propose Deep Inverse Design for HLS (DID4HLS), a novel approach that integrates graph neural networks and generative models. DID4HLS iteratively optimizes hardware designs aimed at compute-intensive algorithms by learning conditional distributions of design features from post-HLS data. Compared to four state-of-the-art DSE baselines, our method achieved an average improvement of 42.5% on average distance to reference set (ADRS) compared to the best-performing baselines across six benchmarks, while demonstrating high robustness and efficiency.
Abstract:Knowledge distillation is a powerful technique to compress large neural networks into smaller, more efficient networks. Softmax regression representation learning is a popular approach that uses a pre-trained teacher network to guide the learning of a smaller student network. While several studies explored the effectiveness of softmax regression representation learning, the underlying mechanism that provides knowledge transfer is not well understood. This paper presents Ideal Joint Classifier Knowledge Distillation (IJCKD), a unified framework that provides a clear and comprehensive understanding of the existing knowledge distillation methods and a theoretical foundation for future research. Using mathematical techniques derived from a theory of domain adaptation, we provide a detailed analysis of the student network's error bound as a function of the teacher. Our framework enables efficient knowledge transfer between teacher and student networks and can be applied to various applications.
Abstract:Time series probabilistic forecasting with multi-dimensional and sporadic data (known as sparse data) has potential to implement monitoring kinds of physiological indices of patients in Intensive Care Unit (ICU). In this paper, we propose Transformer-based Diffusion probabilistic model for Sparse Time series Forecasting (TDSTF), a new model to predict distribution of highly sparse time series. There are many works that focus on probabilistic forecasting, but few of them avoid noise that come from extreme sparsity of data. We take advantage of Triplet, a data organization that represents sparse time series in a much efficient way, for our model to bypass data redundancy in the traditional matrix form. The proposed model performed better on MIMIC-III ICU dataset than the current state-of-the-art probabilistic forecasting models. We obtained normalized average continuous ranked probability score (CRPS) of $\mathbf{0.4379}$, and mean squared error (MSE) of $\mathbf{0.4008}$ when adopting the median of the model samplings as the deterministic forecasting. Our code is provided at https://github.com/PingChang818/TDSTF.