Abstract: Most modern computer vision architectures are built upon a few well-known foundation operations: fully-connected layers, convolutions, and multi-head self-attention blocks. In this paper we propose a novel foundation operation, NeoCell, which learns matrix patterns and performs patchwise matrix multiplications with the input data. The main advantages of the proposed operator are (1) a simple implementation without the need for operations like im2col, (2) low computational complexity (especially for large matrices), and (3) a simple and flexible implementation of up-/down-sampling. We validate the NeoNeXt family of models based on this operation on the ImageNet-1K classification task and show that they achieve competitive quality.
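To make the idea of a patchwise matrix multiplication concrete, here is a minimal PyTorch sketch of one possible form of such an operator (our own illustration, not the authors' code; the class name, the non-overlapping patch handling, and the sharing of the matrices A and B across channels are assumptions): each p×p patch X of the input is replaced by A X B with learned matrices A and B, using only reshapes and batched matmuls, so no im2col is required.

```python
import torch
import torch.nn as nn

class PatchMatMul(nn.Module):
    """Hypothetical patchwise matrix multiplication: every non-overlapping
    p x p patch X of each channel is mapped to A @ X @ B."""
    def __init__(self, patch: int):
        super().__init__()
        self.p = patch
        self.A = nn.Parameter(torch.eye(patch) + 0.01 * torch.randn(patch, patch))
        self.B = nn.Parameter(torch.eye(patch) + 0.01 * torch.randn(patch, patch))

    def forward(self, x):                     # x: (N, C, H, W), H and W divisible by p
        n, c, h, w = x.shape
        p = self.p
        # split the spatial dims into non-overlapping p x p patches (no im2col)
        x = x.reshape(n, c, h // p, p, w // p, p).permute(0, 1, 2, 4, 3, 5)
        x = self.A @ x @ self.B               # batched matmul over the last two dims
        return x.permute(0, 1, 2, 4, 3, 5).reshape(n, c, h, w)

y = PatchMatMul(patch=4)(torch.randn(2, 3, 32, 32))
print(y.shape)                                # torch.Size([2, 3, 32, 32])
```

Up-/down-sampling could then be expressed by making A and B rectangular (e.g. of shape p/2 × p), which changes the spatial size of every patch; this too is only an assumption illustrating the flexibility claimed above.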
Abstract: A seminal work [Jacot et al., 2018] demonstrated that training a neural network under a specific parameterization is equivalent to performing a particular kernel method as the width goes to infinity. This equivalence opened a promising direction for applying the results of the rich literature on kernel methods to neural nets, which are much harder to tackle. The present survey covers key results on the convergence of the kernel as width goes to infinity, finite-width corrections, applications, and a discussion of the limitations of the corresponding method.
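For context, the kernel in question is the neural tangent kernel. The standard definition and the gradient-flow dynamics it induces for the squared loss (standard results from Jacot et al., 2018, not restated in the abstract) are

\[
\Theta(x, x') = \left\langle \frac{\partial f(x;\theta)}{\partial \theta},\; \frac{\partial f(x';\theta)}{\partial \theta} \right\rangle,
\qquad
\frac{\mathrm{d} f_t(x)}{\mathrm{d} t} = -\eta \sum_{i=1}^{n} \Theta(x, x_i)\,\bigl(f_t(x_i) - y_i\bigr).
\]

In the infinite-width limit under the NTK parameterization the kernel \(\Theta\) stays constant during training, so the network's predictions evolve exactly as in kernel regression with \(\Theta\).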
Abstract: In recent years deep learning has achieved significant results in many practical problems, such as computer vision, natural language processing, speech recognition, and many others. For many years the main goal of research was to improve the quality of models, even if the complexity was impractically high. However, for production solutions, which often require real-time operation, the latency of the model plays a very important role. Current state-of-the-art architectures are found with neural architecture search (NAS) taking model complexity into account. However, designing a search space suitable for specific hardware is still a challenging task. To address this problem we propose a measure of the hardware efficiency of a neural architecture search space, the matrix efficiency measure (MEM); a search space comprising hardware-efficient operations; a latency-aware scaling method; and ISyNet, a set of architectures designed to be both fast on specialized neural processing unit (NPU) hardware and accurate. We show the advantage of the designed architectures for NPU devices on ImageNet and their generalization ability on downstream classification and detection tasks.
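As a rough illustration of what "latency-aware" means in practice, here is a hypothetical sketch of the general idea (not the paper's actual MEM or scaling procedure; the function names, input shape, and candidate grid are all assumptions): time each scaled variant on the target device and keep the largest one that fits a latency budget.

```python
# Hypothetical sketch of latency-aware scaling: measure latency of each candidate
# width on the target device and keep the largest one within the budget.
import time
import torch

def measure_latency_ms(model, input_shape=(1, 3, 224, 224), runs=50):
    model.eval()
    x = torch.randn(*input_shape)
    with torch.no_grad():
        for _ in range(10):                       # warm-up iterations
            model(x)
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
    return (time.perf_counter() - start) / runs * 1e3

def pick_width(build_model, budget_ms, widths=(1.0, 1.25, 1.5, 2.0)):
    best = None
    for w in widths:                              # widths sorted in ascending order
        if measure_latency_ms(build_model(w)) <= budget_ms:
            best = w                              # keep the largest width that fits
    return best
```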
Abstract: Deep learning is now widely used in many economic, technical, and scientific areas of human interest. It is clear that the efficiency of solutions based on deep neural networks should be judged not only by the quality metric on the target task, but also by their latency and the design constraints of the target platform. In this paper we present a novel hardware-friendly Tensor-Train decomposition implementation for convolutional neural networks, together with Tensor Yard, a one-shot training algorithm that optimizes the order in which network layers are decomposed. These ideas allow us to accelerate ResNet models on Ascend 310 NPU devices without significant loss of accuracy. For example, we accelerate ResNet-101 by 14.6% with a drop of 0.5 points in top-1 ImageNet accuracy.
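For readers unfamiliar with the decomposition itself, below is a minimal NumPy sketch of the standard TT-SVD algorithm, which factorizes a dense weight tensor into a train of 3-way cores. It illustrates plain Tensor-Train decomposition only; it is not the paper's hardware-friendly implementation or the Tensor Yard ordering algorithm, and the rank choice here is an assumption.

```python
# Minimal TT-SVD sketch: factorize a dense tensor into a train of 3-way cores
# via sequential truncated SVDs (standard algorithm, not the paper's NPU variant).
import numpy as np

def tt_svd(tensor, max_rank):
    shape = tensor.shape
    cores, rank, mat = [], 1, np.asarray(tensor)
    for n in shape[:-1]:
        mat = mat.reshape(rank * n, -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r_new = min(max_rank, len(s))
        cores.append(u[:, :r_new].reshape(rank, n, r_new))
        mat = s[:r_new, None] * vt[:r_new]        # carry the remainder forward
        rank = r_new
    cores.append(mat.reshape(rank, shape[-1], 1))
    return cores

# Sanity check: with ranks large enough the reconstruction is exact up to
# floating-point error; acceleration comes from truncating the ranks.
w = np.random.randn(8, 8, 8, 8)
cores = tt_svd(w, max_rank=64)
approx = cores[0]
for core in cores[1:]:
    approx = np.tensordot(approx, core, axes=1)   # contract neighbouring ranks
print(np.linalg.norm(w - approx.reshape(w.shape)) / np.linalg.norm(w))
```

In a convolutional layer, the 4-way weight kernel (or a reshaped version of it) plays the role of w above, and the resulting chain of small cores can typically be executed as a sequence of cheaper convolutions; Tensor Yard, as described in the abstract, learns the order in which the layers should be decomposed.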