Mohammed F. Tolba

C3PU: Cross-Coupling Capacitor Processing Unit Using Analog-Mixed Signal In-Memory Computing for AI Inference

Oct 11, 2021

Dima Kilani, Baker Mohammad, Yasmin Halawani, Mohammed F. Tolba, Hani Saleh

Abstract: This paper presents a novel cross-coupling capacitor processing unit (C3PU) that supports analog-mixed signal in-memory computing to perform multiply-and-accumulate (MAC) operations. The C3PU consists of a capacitive unit, a CMOS transistor, and a voltage-to-time converter (VTC). The capacitive unit serves as a computational element that holds the multiplier operand and performs multiplication once the multiplicand is applied at the terminal. The multiplicand is the input voltage, which is converted to a pulse-width signal using a low-power VTC. The transistor transfers the result of this multiplication, generating a voltage level. A demonstrator 5x4 C3PU array capable of implementing 4 MAC units is presented. The design has been verified using Monte Carlo simulation in 65 nm technology. The 5x4 C3PU consumed 66.4 fJ/MAC at a 0.3 V supply voltage with an error of 5.7%. Compared to a digital 8x4-bit fixed-point MAC unit, the proposed unit consumes 3.4x less energy and occupies 3.6x less area with a similar error. The C3PU has been applied to iris flower classification using an artificial neural network, achieving 90% classification accuracy compared to an ideal accuracy of 96.67% using MATLAB.

* 10 pages, 12 figures and 7 tables
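
As a rough illustration of what the 5x4 array computes, the sketch below is an ideal (error-free) software reference for a MAC array. It assumes that the 5 rows carry the inputs and that each of the 4 columns forms one MAC output; that mapping, and the names `mac_array` and `relative_error`, are assumptions made for illustration and are not taken from the paper.

```python
def mac_array(inputs, weights):
    """Ideal MAC reference: output j is the sum over i of inputs[i] * weights[i][j].

    Assumed layout (not from the paper): one input per row, one MAC per column.
    """
    rows, cols = len(weights), len(weights[0])
    if len(inputs) != rows:
        raise ValueError("need exactly one input per row")
    return [sum(inputs[i] * weights[i][j] for i in range(rows)) for j in range(cols)]


def relative_error(measured, ideal):
    """Relative deviation of a measured MAC output from the ideal value."""
    return abs(measured - ideal) / abs(ideal)


# Hypothetical operands: 5 inputs and a 5x4 weight matrix (4 MAC columns).
inputs = [1, 2, 3, 4, 5]
weights = [[j + 1 for j in range(4)] for _ in range(5)]
outputs = mac_array(inputs, weights)
```

A reference like this is what an analog result would be compared against when quoting an error figure such as the 5.7% reported in the abstract.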

Deep Neural Networks Based Weight Approximation and Computation Reuse for 2-D Image Classification

Apr 28, 2021

Mohammed F. Tolba, Huruy Tekle Tesfai, Hani Saleh, Baker Mohammad, Mahmoud Al-Qutayri

Abstract: Deep Neural Networks (DNNs) are computationally and memory intensive, which makes their hardware implementation challenging, especially for resource-constrained devices such as IoT nodes. To address this challenge, this paper introduces a new method that improves DNN performance by fusing approximate computing with data reuse techniques for image recognition applications. DNN weights are approximated using linear and quadratic approximation methods during the training phase; all of the weights are then replaced with the linear/quadratic coefficients so that, at inference, different weights can be computed from the same coefficients. This leads to a repetition of the weights across the processing element (PE) array, which in turn enables reuse of DNN sub-computations (computational reuse) and of the same data (data reuse), reducing DNN computations and memory accesses and improving energy efficiency, albeit at the cost of increased training time. A complete analysis on both the MNIST and CIFAR-10 datasets is presented for image recognition, where LeNet-5 showed a reduction in the number of parameters by a factor of 1211.3x with a drop of less than 0.9% in accuracy. Compared to the state-of-the-art Row Stationary (RS) method, the proposed architecture saved 54% of the total number of adders and multipliers needed. Overall, the proposed approach is suitable for IoT edge devices as it reduces both the memory size requirement and the number of memory accesses needed.

* 10 pages, 9 figures
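
To make the idea of replacing weights with coefficients concrete, here is a minimal sketch that fits a line to a short weight vector and regenerates the weights from the two coefficients. The abstract states that the approximation happens during training; this post hoc least-squares fit is a simplified stand-in, and the function names and sample weights are hypothetical.

```python
def fit_line(values):
    """Least-squares fit of values[i] ~ slope * i + intercept (needs len >= 2)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def reconstruct(slope, intercept, n):
    """Regenerate n approximate weights from the two stored coefficients."""
    return [slope * i + intercept for i in range(n)]


# Hypothetical weights: five values are replaced by two coefficients.
weights = [0.10, 0.19, 0.31, 0.42, 0.48]
slope, intercept = fit_line(weights)
approx = reconstruct(slope, intercept, len(weights))
max_err = max(abs(a - w) for a, w in zip(approx, weights))
```

Storing `slope` and `intercept` instead of the individual weights is the source of the parameter reduction in this toy case; the quadratic variant would add one more coefficient per group.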

Fundamental Matrix Estimation: A Study of Error Criteria

Jun 24, 2017

Mohammed E. Fathy, Ashraf S. Hussein, Mohammed F. Tolba

Abstract: The fundamental matrix (FM) describes the geometric relations that exist between two images of the same scene. Different error criteria are used for estimating FMs from an input set of correspondences. In this paper, the accuracy and efficiency of the different error criteria were studied. We proved mathematically and experimentally that the most popular error criterion, the symmetric epipolar distance, is biased. It was also shown that, despite the similarity between the algebraic expressions of the symmetric epipolar distance and the Sampson distance, they have different accuracy properties. In addition, a new error criterion, the Kanatani distance, was proposed and proved to be the most effective for use during the outlier removal phase from both accuracy and efficiency perspectives. To thoroughly test the accuracy of the different error criteria, we proposed a randomized algorithm for Reprojection Error-based Correspondence Generation (RE-CG). As input, RE-CG takes an FM and a desired reprojection error value $d$. As output, RE-CG generates a random correspondence having that error value. Mathematical analysis of this algorithm revealed that the success probability for any given trial is $1 - (2/3)^2$ at best and $1 - (6/7)^2$ at worst, while experiments demonstrated that the algorithm often succeeds after only one trial.

* 15 pages, 7 figures, Pattern Recognition Letters, 2011
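
The algebraic similarity between the two criteria mentioned in the abstract can be seen in a short sketch. It uses the common textbook forms of the squared symmetric epipolar distance and the Sampson distance; the paper's exact definitions (for example, squared versus unsquared) may differ, and the helper names and the sample matrix below are illustrative assumptions.

```python
def _mat_vec(m, v):
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


def _terms(F, p, q):
    """Return the epipolar residual q^T F p and the two squared line-normal norms."""
    x = [p[0], p[1], 1.0]
    xp = [q[0], q[1], 1.0]
    Ft = [[F[j][i] for j in range(3)] for i in range(3)]
    Fx = _mat_vec(F, x)      # epipolar line of p in the second image
    Ftxp = _mat_vec(Ft, xp)  # epipolar line of q in the first image
    r = sum(xp[i] * Fx[i] for i in range(3))
    return r, Fx[0] ** 2 + Fx[1] ** 2, Ftxp[0] ** 2 + Ftxp[1] ** 2


def symmetric_epipolar_distance(F, p, q):
    """Sum of squared point-to-epipolar-line distances in both images."""
    r, n1, n2 = _terms(F, p, q)
    return r * r * (1.0 / n1 + 1.0 / n2)


def sampson_distance(F, p, q):
    """First-order approximation of the squared reprojection error."""
    r, n1, n2 = _terms(F, p, q)
    return r * r / (n1 + n2)


# Assumed example: F for a purely horizontal camera translation,
# where the residual reduces to the vertical offset between the points.
F = [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
sed = symmetric_epipolar_distance(F, (10.0, 5.0), (12.0, 3.0))
sam = sampson_distance(F, (10.0, 5.0), (12.0, 3.0))
```

Both functions share the same numerator and differ only in how the two squared norms enter the denominator, which is why they look alike yet weight the same residual differently.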
