Abstract: Bayesian Optimization Mixed-Precision Neural Architecture Search (BOMP-NAS) is an approach to quantization-aware neural architecture search (QA-NAS) that leverages both Bayesian optimization (BO) and mixed-precision quantization (MP) to efficiently search for compact, high-performance deep neural networks. The results show that integrating quantization-aware fine-tuning (QAFT) into the NAS loop is a necessary step to find networks that perform well under low-precision quantization: integrating it allows a model size reduction of nearly 50\% on the CIFAR-10 dataset. BOMP-NAS is able to find neural networks that achieve state-of-the-art performance at much lower design costs. This study shows that BOMP-NAS finds these networks with a 6x shorter search time than the closest related work.
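As a rough illustration of why mixed-precision quantization yields compact networks, the sketch below computes the weight storage of a network when each layer gets its own bit-width. The layer sizes and bit-width assignments are hypothetical and are not taken from BOMP-NAS or its reported results; the function name `model_size_bytes` is likewise an assumption made for this example.

```python
def model_size_bytes(param_counts, bitwidths):
    """Weight storage in bytes when layer i stores param_counts[i]
    weights at bitwidths[i] bits each (activations are ignored)."""
    if len(param_counts) != len(bitwidths):
        raise ValueError("need exactly one bit-width per layer")
    total_bits = sum(n * b for n, b in zip(param_counts, bitwidths))
    return total_bits / 8


# Hypothetical per-layer weight counts, for illustration only.
layers = [432, 36864, 73728, 5120]

# Uniform 8-bit quantization versus a mixed-precision assignment that
# keeps the small first and last layers at 8 bits and the rest at 4 bits.
uniform_8bit = model_size_bytes(layers, [8, 8, 8, 8])
mixed = model_size_bytes(layers, [8, 4, 4, 8])

print(f"uniform 8-bit: {uniform_8bit:.0f} bytes")
print(f"mixed 8/4-bit: {mixed:.0f} bytes")
```

A search procedure would treat the per-layer bit-widths as part of the search space alongside the architecture; this sketch only covers the size term that such a search could trade off against accuracy.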
Abstract: A key enabler of deploying convolutional neural networks on resource-constrained embedded systems is the binary neural network (BNN). BNNs save on memory and simplify computation by binarizing both features and weights. Unfortunately, binarization is inevitably accompanied by a severe decrease in accuracy. To reduce the accuracy gap between binary and full-precision networks, many repair methods have been proposed in the recent past, which we classify and bring together in a single overview in this chapter. The repair methods are divided into two main branches, training techniques and network topology changes, each of which can be further split into smaller categories. Network topology changes introduce additional cost (energy consumption or additional area) for an embedded system, while training techniques do not. From our overview, we observe that progress has been made in reducing the accuracy gap, but BNN papers are not aligned on which repair methods should be used to obtain highly accurate BNNs. Therefore, this chapter contains an empirical review that evaluates the benefits of many repair methods in isolation on the ResNet-20\&CIFAR10 and ResNet-18\&CIFAR100 benchmarks. We found three repair categories most beneficial: feature binarizer, feature normalization, and double residual. Based on this review, we discuss future directions and research opportunities. We sketch the benefits and costs associated with BNNs on embedded systems because it remains to be seen whether BNNs will be able to close the accuracy gap while staying highly energy-efficient on resource-constrained embedded systems.
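The claim that binarizing both features and weights saves memory and simplifies computation can be made concrete with a minimal sketch. It assumes the common sign-based binarization to +1/-1 (with 0 mapped to +1) and shows that the dot product of two binarized vectors reduces to an XNOR followed by a popcount on bit-packed words. The helper names and the example values are hypothetical, and none of the repair methods surveyed in the chapter are implemented here.

```python
def binarize(values):
    """Map each real value to +1 or -1 with the sign function (0 maps to +1)."""
    return [1 if v >= 0 else -1 for v in values]


def pack_bits(signs):
    """Pack a +1/-1 vector into one integer, one bit per element (+1 -> 1, -1 -> 0)."""
    word = 0
    for i, s in enumerate(signs):
        if s == 1:
            word |= 1 << i
    return word


def xnor_popcount_dot(a_word, b_word, n):
    """Dot product of two packed +1/-1 vectors of length n.

    XNOR marks the positions where the signs agree; each agreement
    contributes +1 and each disagreement -1, so dot = 2 * agreements - n.
    """
    mask = (1 << n) - 1
    agreements = bin(~(a_word ^ b_word) & mask).count("1")
    return 2 * agreements - n


# Hypothetical full-precision weights and features, for illustration only.
weights = [0.7, -1.2, 0.1, -0.4]
features = [-0.3, -0.8, 2.0, 0.5]

w_b = binarize(weights)
x_b = binarize(features)

# Reference result with ordinary multiply-accumulate on the +1/-1 values.
reference = sum(w * x for w, x in zip(w_b, x_b))
# Same result using only bitwise operations on the packed words.
fast = xnor_popcount_dot(pack_bits(w_b), pack_bits(x_b), len(w_b))
```

Each binarized value needs a single bit instead of a 32-bit float, and the multiply-accumulate is replaced by bitwise logic, which is where the memory and compute savings come from. The accuracy loss that the repair methods target comes from the information discarded in `binarize`.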