Abstract:This two-part comprehensive survey is devoted to a computing framework most commonly known under the names Hyperdimensional Computing and Vector Symbolic Architectures (HDC/VSA). Both names refer to a family of computational models that use high-dimensional distributed representations and rely on the algebraic properties of their key operations to incorporate the advantages of structured symbolic representations and vector distributed representations. Notable models in the HDC/VSA family are Tensor Product Representations, Holographic Reduced Representations, Multiply-Add-Permute, Binary Spatter Codes, and Sparse Binary Distributed Representations but there are other models too. HDC/VSA is a highly interdisciplinary area with connections to computer science, electrical engineering, artificial intelligence, mathematics, and cognitive science. This fact makes it challenging to create a thorough overview of the area. However, due to a surge of new researchers joining the area in recent years, the necessity for a comprehensive survey of the area has become extremely important. Therefore, amongst other aspects of the area, this Part I surveys important aspects such as: known computational models of HDC/VSA and transformations of various input data types to high-dimensional distributed representations. Part II of this survey is devoted to applications, cognitive computing and architectures, as well as directions for future work. The survey is written to be useful for both newcomers and practitioners.
Abstract:Motivated by recent innovations in biologically-inspired neuromorphic hardware, this paper presents a novel unsupervised machine learning approach named Hyperseed that leverages Vector Symbolic Architectures (VSA) for fast learning a topology preserving feature map of unlabelled data. It relies on two major capabilities of VSAs: the binding operation and computing in superposition. In this paper, we introduce the algorithmic part of Hyperseed expressed within Fourier Holographic Reduced Representations VSA model, which is specifically suited for implementation on spiking neuromorphic hardware. The two distinctive novelties of the Hyperseed algorithm are: 1) Learning from only few input data samples and 2) A learning rule based on a single vector operation. These properties are demonstrated on synthetic datasets as well as on illustrative benchmark use-cases, IRIS classification and a language identification task using n-gram statistics.
Abstract:A change of the prevalent supervised learning techniques is foreseeable in the near future: from the complex, computational expensive algorithms to more flexible and elementary training ones. The strong revitalization of randomized algorithms can be framed in this prospect steering. We recently proposed a model for distributed classification based on randomized neural networks and hyperdimensional computing, which takes into account cost of information exchange between agents using compression. The use of compression is important as it addresses the issues related to the communication bottleneck, however, the original approach is rigid in the way the compression is used. Therefore, in this work, we propose a more flexible approach to compression and compare it to conventional compression algorithms, dimensionality reduction, and quantization techniques.
Abstract:This article reviews recent progress in the development of the computing framework Vector Symbolic Architectures (also known as Hyperdimensional Computing). This framework is well suited for implementation in stochastic, nanoscale hardware and it naturally expresses the types of cognitive operations required for Artificial Intelligence (AI). We demonstrate in this article that the ring-like algebraic structure of Vector Symbolic Architectures offers simple but powerful operations on high-dimensional vectors that can support all data structures and manipulations relevant in modern computing. In addition, we illustrate the distinguishing feature of Vector Symbolic Architectures, "computing in superposition," which sets it apart from conventional computing. This latter property opens the door to efficient solutions to the difficult combinatorial search problems inherent in AI applications. Vector Symbolic Architectures are Turing complete, as we show, and we see them acting as a framework for computing with distributed representations in myriad AI settings. This paper serves as a reference for computer architects by illustrating techniques and philosophy of VSAs for distributed computing and relevance to emerging computing hardware, such as neuromorphic computing.
Abstract:This correspondence comments on the findings reported in a recent Science Robotics article by Mitrokhin et al. [1]. The main goal of this commentary is to expand on some of the issues touched on in that article. Our experience is that hyperdimensional computing is very different from other approaches to computation and that it can take considerable exposure to its concepts before attaining practically useful understanding. Therefore, in order to provide an overview of the area to the first time reader of [1], the commentary includes a brief historic overview as well as connects the findings of the article to a larger body of literature existing in the area.
Abstract:Recent advances in Deep Learning have led to a significant performance increase on several NLP tasks, however, the models become more and more computationally demanding. Therefore, this paper tackles the domain of computationally efficient algorithms for NLP tasks. In particular, it investigates distributed representations of n-gram statistics of texts. The representations are formed using hyperdimensional computing enabled embedding. These representations then serve as features, which are used as input to standard classifiers. We investigate the applicability of the embedding on one large and three small standard datasets for classification tasks using nine classifiers. The embedding achieved on par F1 scores while decreasing the time and memory requirements by several times compared to the conventional n-gram statistics, e.g., for one of the classifiers on a small dataset, the memory reduction was 6.18 times; while train and test speed-ups were 4.62 and 3.84 times, respectively. For many classifiers on the large dataset, the memory reduction was about 100 times and train and test speed-ups were over 100 times. More importantly, the usage of distributed representations formed via hyperdimensional computing allows dissecting the strict dependency between the dimensionality of the representation and the parameters of n-gram statistics, thus, opening a room for tradeoffs.
Abstract:The deployment of machine learning algorithms on resource-constrained edge devices is an important challenge from both theoretical and applied points of view. In this article, we focus on resource-efficient randomly connected neural networks known as Random Vector Functional Link (RVFL) networks since their simple design and extremely fast training time make them very attractive for solving many applied classification tasks. We propose to represent input features via the density-based encoding known in the area of stochastic computing and use the operations of binding and bundling from the area of hyperdimensional computing for obtaining the activations of the hidden neurons. Using a collection of 121 real-world datasets from the UCI Machine Learning Repository, we empirically show that the proposed approach demonstrates higher average accuracy than the conventional RVFL. We also demonstrate that it is possible to represent the readout matrix using only integers in a limited range with minimal loss in the accuracy. In this case, the proposed approach operates only on small n-bits integers, which results in a computationally efficient architecture. Finally, through hardware FPGA implementations, we show that such an approach consumes approximately eleven times less energy than that of the conventional RVFL.
Abstract:We propose an approximation of Echo State Networks (ESN) that can be efficiently implemented on digital hardware based on the mathematics of hyperdimensional computing. The reservoir of the proposed Integer Echo State Network (intESN) is a vector containing only n-bits integers (where n<8 is normally sufficient for a satisfactory performance). The recurrent matrix multiplication is replaced with an efficient cyclic shift operation. The intESN architecture is verified with typical tasks in reservoir computing: memorizing of a sequence of inputs; classifying time-series; learning dynamic processes. Such an architecture results in dramatic improvements in memory footprint and computational efficiency, with minimal performance loss.
Abstract:This article proposes the use of Vector Symbolic Architectures for implementing Hierarchical Graph Neuron, an architecture for memorizing patterns of generic sensor stimuli. The adoption of a Vector Symbolic representation ensures a one-layered design for the approach, while maintaining the previously reported properties and performance characteristics of Hierarchical Graph Neuron, and also improving the noise resistance of the architecture. The proposed architecture enables a linear (with respect to the number of stored entries) time search for an arbitrary sub-pattern.