Abstract: This paper introduces FSL-HDnn, an energy-efficient accelerator that implements the end-to-end pipeline of feature extraction, classification, and on-chip few-shot learning (FSL) through gradient-free learning techniques in a 40 nm CMOS process. At its core, FSL-HDnn integrates two low-power modules: a weight-clustering feature extractor and a hyperdimensional computing (HDC) classifier. The feature extractor uses weight clustering and pattern reuse strategies to optimize CNN-based feature extraction. HDC, in turn, serves as a lightweight FSL classifier that employs hyperdimensional vectors to improve training accuracy significantly compared to traditional distance-based approaches. This dual-module design simplifies the learning process by eliminating the need for gradient computation and also improves energy efficiency and performance. Specifically, FSL-HDnn achieves an energy efficiency of 5.7 TOPS/W for feature extraction and 0.78 TOPS/W for the classification and learning phases, improvements of 2.6X and 6.6X, respectively, over current state-of-the-art CNN and FSL processors.
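To make the gradient-free learning idea above more concrete, the following is a minimal software sketch of a generic HDC few-shot classifier. It is not the FSL-HDnn hardware design: the class name `HDCClassifier`, the random bipolar projection encoder, the hypervector dimension of 1024, and the cosine-similarity readout are all assumptions chosen for illustration. Features are assumed to come from some upstream extractor, which is not modeled here.

```python
import math
import random


class HDCClassifier:
    """Illustrative HDC few-shot classifier (hypothetical, not the chip's design)."""

    def __init__(self, feat_dim, n_classes, hd_dim=1024, seed=0):
        rng = random.Random(seed)
        # Fixed random bipolar projection matrix (assumed encoder).
        self.proj = [[rng.choice((-1, 1)) for _ in range(feat_dim)]
                     for _ in range(hd_dim)]
        # One accumulator ("prototype") hypervector per class.
        self.protos = [[0] * hd_dim for _ in range(n_classes)]

    def encode(self, x):
        # Project the feature vector and binarize each component to +1/-1.
        return [1 if sum(w * v for w, v in zip(row, x)) >= 0 else -1
                for row in self.proj]

    def learn(self, x, label):
        # Single-pass, gradient-free update: add the encoded sample
        # to the prototype of its class (bundling).
        h = self.encode(x)
        self.protos[label] = [p + b for p, b in zip(self.protos[label], h)]

    def predict(self, x):
        # Return the class whose prototype has the highest cosine similarity.
        h = self.encode(x)
        h_norm = math.sqrt(len(h))
        best_label, best_sim = -1, -math.inf
        for label, proto in enumerate(self.protos):
            p_norm = math.sqrt(sum(p * p for p in proto))
            if p_norm == 0:
                continue  # class has no training samples yet
            sim = sum(p * b for p, b in zip(proto, h)) / (p_norm * h_norm)
            if sim > best_sim:
                best_label, best_sim = label, sim
        return best_label
```

In this sketch, learning a new class from a handful of samples only requires calling `learn` once per sample; no backpropagation or iterative optimization is involved, which is the property the abstract refers to as gradient-free.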
Abstract: Feed recommendation models are widely adopted by feed platforms to encourage users to explore content they are interested in. However, most current research focuses only on targeting user preference and lacks an in-depth study of how to avoid frequently recommending objectionable content, which is a common cause of user aversion. To address this issue, we propose a Deep Latent Emotion Network (DLEN) model that extracts the latent probability of a user preferring a feed by modeling multiple targets with semi-supervised learning. With this method, conflicts between different targets are reduced in the training phase, which effectively improves the training accuracy of each target. In addition, by adding this latent state of user emotion to multi-target fusion, the model can decrease the probability of recommending objectionable content, improving user retention and stay time during online testing. DLEN is deployed in a real-world multi-task feed recommendation scenario of Tencent QQ-Small-World with a dataset containing over a billion samples, and it exhibits a significant performance advantage over the SOTA MTL model in offline evaluation, together with increases of 3.02% in view count and 2.63% in user stay time in production. Complementary offline experiments with DLEN on a public dataset reproduce these improvements in various scenarios. At present, DLEN has been successfully deployed in Tencent's feed recommendation system.