Abstract: Classes of target functions containing a large number of approximately orthogonal elements are known to be hard to learn by Statistical Query algorithms. This classical fact has recently re-emerged in the theory of gradient-based optimization of neural networks, where the hardness of a class is usually quantified by the variance of the gradient with respect to a random choice of a target function. The set of functions of the form $x \mapsto ax \bmod p$, where $a$ ranges over ${\mathbb Z}_p$, has recently attracted attention from deep learning theorists and cryptographers. This class can be viewed as a subset of the $p$-periodic functions on ${\mathbb Z}$ and is tightly connected with a class of high-frequency periodic functions on the real line. We present a mathematical analysis of the limitations and challenges of using gradient-based techniques to learn a high-frequency periodic function or modular multiplication from examples. We show that the variance of the gradient is negligibly small in both cases when either the frequency or the prime base $p$ is large, which in turn prevents such a learning algorithm from succeeding.
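To make the hardness measure concrete, here is a minimal numerical sketch of the quantity in question: the variance, over a random target $a$, of the loss gradient at a fixed initialization. The one-hidden-layer ReLU model, the square loss, the rescaling of inputs and targets to $[0,1)$, and all sizes are illustrative assumptions rather than the paper's exact setup.

```python
# Toy estimate of Var_a[grad L_a(theta)] at one random initialization, for targets
# f_a(x) = a*x mod p.  Model, loss, rescaling, and sizes are assumptions for
# illustration only; larger p is expected to shrink the estimated variance.
import jax
import jax.numpy as jnp

p = 251                                    # small prime, illustrative
key = jax.random.PRNGKey(0)
kW, kc, kv = jax.random.split(key, 3)
params = (jax.random.normal(kW, (32, 1)),  # hidden weights
          jax.random.normal(kc, (32,)),    # hidden biases
          jax.random.normal(kv, (32,)))    # output weights

def model(params, x):
    W, c, v = params
    return jnp.maximum(W @ jnp.atleast_1d(x) + c, 0.0) @ v   # one-hidden-layer ReLU net

def loss(params, a):
    xs = jnp.arange(p)
    preds = jax.vmap(lambda x: model(params, x / p))(xs.astype(jnp.float32))
    targets = ((a * xs) % p) / p                             # rescaled a*x mod p
    return jnp.mean((preds - targets) ** 2)

grad_fn = jax.grad(loss)
grads = [grad_fn(params, a) for a in range(1, p)]
flat = jnp.stack([jnp.concatenate([g.ravel() for g in grad]) for grad in grads])
print("mean per-coordinate gradient variance across targets a:",
      jnp.var(flat, axis=0).mean())
```

Sweeping $p$ over several primes and comparing the printed value gives a quick empirical feel for the claimed decay of the gradient variance.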
Abstract: The discrete logarithm problem is a fundamental challenge in number theory with significant implications for cryptographic protocols. In this paper, we investigate the limitations of gradient-based methods for learning the parity bit of the discrete logarithm in finite cyclic groups of prime order. Our main result, supported by theoretical analysis and empirical verification, reveals that the gradient of the loss function concentrates around a fixed point independent of the base of the logarithm. This concentration property severely restricts the ability of gradient-based methods to learn the parity bit efficiently, irrespective of the complexity of the network architecture being trained. Our proof relies on the Boas-Bellman inequality in inner product spaces and establishes approximate orthogonality of the discrete logarithm's parity-bit functions via the spectral norm of certain matrices. Empirical experiments with a neural network further confirm the limitations of gradient-based learning, demonstrating a decreasing success rate in predicting the parity bit as the group order increases.
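For concreteness, the target family can be written down directly: for a prime $p$ and a primitive root $g$ of ${\mathbb Z}_p^*$, each element $x = g^k \bmod p$ is labeled with the parity of $k$. The sketch below, with illustrative values of $p$ and $g$, only generates these labels and is not the paper's experimental code.

```python
# Minimal sketch of the label-generating function: label x = g^k mod p with the
# parity of k.  The prime p and primitive root g below are illustrative choices.
def parity_bit_labels(p, g):
    labels = {}
    x = 1
    for k in range(p - 1):        # enumerate g^0, g^1, ..., g^{p-2}
        labels[x] = k & 1         # parity bit of the discrete log of x to base g
        x = (x * g) % p
    return labels

print(parity_bit_labels(11, 2))   # 2 is a primitive root modulo 11
```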
Abstract: We suggest a simple Gaussian mixture model for data generation that complies with Feldman's long-tail theory (2020). We demonstrate that a linear classifier cannot decrease the generalization error below a certain level in the proposed model, whereas a nonlinear classifier with memorization capacity can. This confirms that, for long-tailed distributions, rare training examples must be taken into account for optimal generalization to new data. Finally, we show that the performance gap between linear and nonlinear models narrows as the tail of the subpopulation frequency distribution becomes shorter, as confirmed by experiments on synthetic and real data.
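One plausible instantiation of such a data generator is sketched below: Gaussian subpopulations whose sampling frequencies follow a heavy-tailed (here Zipf-like) prior. The number of components, dimension, noise scale, and the specific frequency prior are assumptions for illustration; the abstract does not fix them.

```python
# Toy long-tailed Gaussian mixture generator in the spirit of the abstract.
# The Zipf-like frequency prior and all constants are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n_components, dim, n_samples = 200, 10, 5000

freq = 1.0 / np.arange(1, n_components + 1)      # long-tailed subpopulation frequencies
freq /= freq.sum()
centers = rng.normal(size=(n_components, dim))             # one Gaussian per subpopulation
component_labels = rng.integers(0, 2, size=n_components)   # binary class labels

comp = rng.choice(n_components, size=n_samples, p=freq)    # rare subpopulations form the tail
X = centers[comp] + 0.1 * rng.normal(size=(n_samples, dim))
y = component_labels[comp]
```

Training a linear model and a memorizing nonlinear model (for example, 1-nearest-neighbor) on samples from this generator is a quick way to observe the gap described above.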
Abstract: Nowadays, it is common for people to photograph every beverage, snack, or meal they eat and post these photographs on social media platforms. Leveraging this trend, real-time food recognition and reliable classification of captured food images can potentially replace some of the tedious recording and coding of food diaries and enable personalized dietary interventions. Although Central Asian cuisine is culturally and historically distinct, little data has been published on the food and dietary habits of people in this region. To fill this gap, we aim to create a reliable dataset of regional foods that is easily accessible to both public consumers and researchers. To the best of our knowledge, this is the first work on creating a Central Asian Food Dataset (CAFD). The final dataset contains 42 food categories and over 16,000 images of national dishes unique to this region. We achieved a classification accuracy of 88.70\% (42 classes) on the CAFD using the ResNet152 neural network model. The food recognition models trained on the CAFD demonstrate the effectiveness and high accuracy of computer vision for dietary assessment.
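As a rough indication of the kind of model behind the reported accuracy, the snippet below sets up a 42-class ResNet152 classification head. The abstract does not describe the authors' training pipeline, so the pretrained weights, optimizer, and hyperparameters here are assumptions, not the CAFD recipe.

```python
# Hypothetical 42-class fine-tuning setup with torchvision's ResNet152.
# Weight choice, optimizer, and learning rate are illustrative assumptions.
import torch.nn as nn
import torch.optim as optim
from torchvision import models

model = models.resnet152(weights=models.ResNet152_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 42)    # 42 CAFD food categories
optimizer = optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
criterion = nn.CrossEntropyLoss()
```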
Abstract: Understanding the accuracy limits of machine learning algorithms is essential for data scientists to properly measure performance and continually improve their models' predictive capabilities. This study empirically verifies the error bound of the AdaBoost algorithm on both synthetic and real-world data. The results show that the error bound holds in practice, demonstrating its practical relevance to a variety of applications. The corresponding source code is available at https://github.com/armanbolatov/adaboost_error_bound.
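A quick way to reproduce the flavor of such a check is sketched below, assuming the bound in question is the classical AdaBoost training-error bound $\prod_t 2\sqrt{\epsilon_t(1-\epsilon_t)}$; the synthetic dataset, stump learners, and round count are illustrative, and the repository above should be consulted for the authors' actual experiments.

```python
# Hedged numerical check of the classical AdaBoost training-error bound
#     err_train <= prod_t 2*sqrt(eps_t * (1 - eps_t)),
# using a hand-rolled discrete AdaBoost over decision stumps on synthetic data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y01 = make_classification(n_samples=2000, n_features=20, random_state=0)
y = 2 * y01 - 1                              # labels in {-1, +1}
n, T = len(y), 100
w = np.full(n, 1.0 / n)                      # example weights
F = np.zeros(n)                              # running ensemble score
bound = 1.0

for t in range(T):
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
    h = stump.predict(X)
    eps = np.sum(w[h != y])                  # weighted error eps_t of round t
    alpha = 0.5 * np.log((1 - eps) / eps)
    F += alpha * h
    w *= np.exp(-alpha * y * h)              # reweight examples
    w /= w.sum()
    bound *= 2.0 * np.sqrt(eps * (1.0 - eps))

train_err = np.mean(np.sign(F) != y)
print(f"training error {train_err:.4f} <= bound {bound:.4f}")
```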