Ernest Fokoué

Rochester Institute of Technology - United States

No Intelligence Without Statistics: The Invisible Backbone of Artificial Intelligence

Oct 22, 2025

Ernest Fokoué

Abstract: The rapid ascent of artificial intelligence (AI) is often portrayed as a revolution born from computer science and engineering. This narrative, however, obscures a fundamental truth: the theoretical and methodological core of AI is, and has always been, statistical. This paper systematically argues that the field of statistics provides the indispensable foundation for machine learning and modern AI. We deconstruct AI into nine foundational pillars (Inference, Density Estimation, Sequential Learning, Generalization, Representation Learning, Interpretability, Causality, Optimization, and Unification), demonstrating that each is built upon century-old statistical principles. From the inferential frameworks of hypothesis testing and estimation that underpin model evaluation, to the density estimation roots of clustering and generative AI; from the time-series analysis inspiring recurrent networks to the causal models that promise true understanding, we trace an unbroken statistical lineage. While celebrating the computational engines that power modern AI, we contend that statistics provides the brain (the theoretical frameworks, uncertainty quantification, and inferential goals), while computer science provides the brawn (the scalable algorithms and hardware). Recognizing this statistical backbone is not merely an academic exercise, but a necessary step for developing more robust, interpretable, and trustworthy intelligent systems. We issue a call to action for education, research, and practice to re-embrace this statistical foundation. Ignoring these roots risks building a fragile future; embracing them is the path to truly intelligent machines. There is no machine learning without statistical learning; no artificial intelligence without statistical thought.

* 37 pages, 6 figures

Towards AI-Driven Policing: Interdisciplinary Knowledge Discovery from Police Body-Worn Camera Footage

Apr 28, 2025

Anita Srbinovska, Angela Srbinovska, Vivek Senthil, Adrian Martin, John McCluskey, Ernest Fokoué

Abstract:This paper proposes a novel interdisciplinary framework for analyzing police body-worn camera (BWC) footage from the Rochester Police Department (RPD) using advanced artificial intelligence (AI) and statistical machine learning (ML) techniques. Our goal is to detect, classify, and analyze patterns of interaction between police officers and civilians to identify key behavioral dynamics, such as respect, disrespect, escalation, and de-escalation. We apply multimodal data analysis by integrating video, audio, and natural language processing (NLP) techniques to extract meaningful insights from BWC footage. We present our methodology, computational techniques, and findings, outlining a practical approach for law enforcement while advancing the frontiers of knowledge discovery from police BWC data.

* 6 pages, 2 figures, and 1 table

Emerging Statistical Machine Learning Techniques for Extreme Temperature Forecasting in U.S. Cities

Jul 26, 2023

Kameron B. Kinast, Ernest Fokoué

Abstract: In this paper, we present a comprehensive analysis of extreme temperature patterns using emerging statistical machine learning techniques. Our research focuses on exploring and comparing the effectiveness of various statistical models for climate time series forecasting. The models considered include Auto-Regressive Integrated Moving Average, Exponential Smoothing, Multilayer Perceptrons, and Gaussian Processes. We apply these methods to climate time series data from the five most populated U.S. cities, utilizing Python and Julia to demonstrate the role of statistical computing in understanding climate change and its impacts. Our findings highlight the differences between the statistical methods and identify Multilayer Perceptrons as the most effective approach. Additionally, we project extreme temperatures up to 2030 using this best-performing method, and test the hypothesis that the temperature changes are greater than zero.

* 13 pages, 4 large figures

Efficient Novelty Detection Methods for Early Warning of Potential Fatal Diseases

Aug 06, 2022

Sèdjro Salomon Hotegni, Ernest Fokoué

Abstract: Fatal diseases, as Critical Health Episodes (CHEs), represent real dangers for patients hospitalized in Intensive Care Units. These episodes can lead to irreversible organ damage and death. Diagnosing them in time, however, would greatly reduce their impact. This study therefore focused on building a highly effective early warning system for CHEs such as Acute Hypotensive Episodes and Tachycardia Episodes. To enable early prediction, a gap of one hour was considered between the observation periods (Observation Windows) and the periods during which a critical event can occur (Target Windows). The MIMIC II dataset was used to evaluate the performance of the proposed system. The system first extracts additional features using three different modes. Feature selection is then performed using Mutual Information Gain feature importance to retain the most relevant features. Finally, the high-performance predictive model LightGBM is used to perform episode classification. This approach, called MIG-LightGBM, was evaluated using five different metrics: Event Recall (ER), Reduced Precision (RP), average Anticipation Time (aveAT), average False Alarms (aveFA), and Event F1-score (EF1-score). A method is considered highly efficient for the early prediction of CHEs if it exhibits not only a large aveAT but also a large EF1-score and a low aveFA. Compared to systems using Extreme Gradient Boosting, Support Vector Classification, or Naive Bayes as a predictive model, the proposed system was found to be clearly superior. It also outperformed the Layered Learning approach.

* 12 pages, 3 figures
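
The Observation Window / Target Window setup in the abstract above can be pictured with a small sketch. The abstract states only the one-hour gap; the function name `make_windows`, the window lengths, the step, and the sampling unit are all hypothetical choices made for illustration.

```python
def make_windows(series, obs_len, gap, target_len, step=1):
    """Pair each observation window with a target window `gap` samples later.

    All lengths are in samples. With one sample per minute (an assumption),
    a one-hour gap would be gap=60.
    """
    total = obs_len + gap + target_len
    pairs = []
    for start in range(0, len(series) - total + 1, step):
        obs = series[start:start + obs_len]
        tgt_start = start + obs_len + gap
        pairs.append((obs, series[tgt_start:tgt_start + target_len]))
    return pairs


# Toy series: features would be computed on `obs`, the label on `tgt`.
pairs = make_windows(list(range(10)), obs_len=3, gap=2, target_len=2)
first_obs, first_tgt = pairs[0]
```

Keeping the gap explicit makes it hard to leak target-period samples into the features by accident.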

Boosting the Predictive Accuracy of Singer Identification Using Discrete Wavelet Transform For Feature Extraction

Jan 31, 2021

Victoire Djimna Noyum, Younous Perieukeu Mofenjou, Cyrille Feudjio, Alkan Göktug, Ernest Fokoué

Abstract: Given the growth and diversity of music today, searching for a specific song is becoming increasingly complex. Knowing the identity of the singer facilitates this search. In this project, we focus on the problem of identifying the singer by using different methods for feature extraction. In particular, we introduce the Discrete Wavelet Transform (DWT) for this purpose. To the best of our knowledge, DWT has never been used this way before in the context of singer identification. The process consists of three crucial parts. First, the vocal signal is separated from the background music using Robust Principal Component Analysis (RPCA). Second, features are extracted from the obtained vocal signal. Here, the goal is to study the performance of the DWT in comparison to the Mel Frequency Cepstral Coefficient (MFCC), the most widely used technique for audio signals. Finally, we proceed with the identification of the singer, for which two methods were tested: the Support Vector Machine (SVM) and the Gaussian Mixture Model (GMM). We conclude that, for a dataset of 4 singers and 200 songs, the best identification system consists of the DWT (db4) feature extraction introduced in this work combined with a linear support vector machine for identification, resulting in a mean accuracy of 83.96%.
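
To make the idea of DWT-based feature extraction concrete, here is a minimal sketch that is not the authors' pipeline. It swaps in the Haar wavelet (db1) for the db4 wavelet named in the abstract, because Haar can be written in a few lines without a wavelet library. The choice of sub-band energies as features is also an assumption.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def subband_energies(signal, levels):
    """Energy of each detail band, then of the final approximation band."""
    feats, current = [], list(signal)
    for _ in range(levels):
        current, detail = haar_dwt(current)
        feats.append(sum(d * d for d in detail))
    feats.append(sum(a * a for a in current))
    return feats


# Toy signal; a real input would be a frame of the separated vocal track.
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
features = subband_energies(signal, levels=2)
```

Because the transform is orthonormal, the band energies sum to the energy of the input, which is a convenient sanity check. A feature vector of this kind could then be passed to a classifier such as an SVM.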

A Novel Use of Discrete Wavelet Transform Features in the Prediction of Epileptic Seizures from EEG Data

Jan 31, 2021

Cyrille Feudjio, Victoire Djimna Noyum, Younous Perieukeu Mofendjou, Rockefeller, Ernest Fokoué

Abstract: This paper demonstrates the predictive superiority of the discrete wavelet transform (DWT) over previously used methods of feature extraction in the diagnosis of epileptic seizures from EEG data. Classification accuracy, specificity, and sensitivity are used as evaluation metrics. We specifically show the immense potential of two combinations (DWT-db4 combined with SVM and DWT-db2 combined with RF) compared to others for diagnosing epileptic seizures in both the balanced and the imbalanced dataset. The results also highlight that MFCC performs worse than all the DWT variants used in this study, and that the mean differences are statistically significant in both the imbalanced and the balanced dataset. Finally, in both the balanced and the imbalanced dataset, the feature extraction techniques, the models, and the interaction between them have a statistically significant effect on the classification accuracy.

Nonnegative Matrix Factorization with Zellner Penalty

Dec 07, 2020

Matthew Corsetti, Ernest Fokoué

Abstract:Nonnegative matrix factorization (NMF) is a relatively new unsupervised learning algorithm that decomposes a nonnegative data matrix into a parts-based, lower dimensional, linear representation of the data. NMF has applications in image processing, text mining, recommendation systems and a variety of other fields. Since its inception, the NMF algorithm has been modified and explored by numerous authors. One such modification involves the addition of auxiliary constraints to the objective function of the factorization. The purpose of these auxiliary constraints is to impose task-specific penalties or restrictions on the objective function. Though many auxiliary constraints have been studied, none have made use of data-dependent penalties. In this paper, we propose Zellner nonnegative matrix factorization (ZNMF), which uses data-dependent auxiliary constraints. We assess the facial recognition performance of the ZNMF algorithm and several other well-known constrained NMF algorithms using the Cambridge ORL database.

* Open Journal of Statistics 5 (2015) 777-786
* 10 pages, 4 figures, 2 tables

Nonnegative Matrix Factorization with Toeplitz Penalty

Dec 07, 2020

Matthew Corsetti, Ernest Fokoué

Abstract:Nonnegative Matrix Factorization (NMF) is an unsupervised learning algorithm that produces a linear, parts-based approximation of a data matrix. NMF constructs a nonnegative low rank basis matrix and a nonnegative low rank matrix of weights which, when multiplied together, approximate the data matrix of interest using some cost function. The NMF algorithm can be modified to include auxiliary constraints which impose task-specific penalties or restrictions on the cost function of the matrix factorization. In this paper we propose a new NMF algorithm that makes use of non-data dependent auxiliary constraints which incorporate a Toeplitz matrix into the multiplicative updating of the basis and weight matrices. We compare the facial recognition performance of our new Toeplitz Nonnegative Matrix Factorization (TNMF) algorithm to the performance of the Zellner Nonnegative Matrix Factorization (ZNMF) algorithm which makes use of data-dependent auxiliary constraints. We also compare the facial recognition performance of the two aforementioned algorithms with the performance of several preexisting constrained NMF algorithms that have non-data-dependent penalties. The facial recognition performances are evaluated using the Cambridge ORL Database of Faces and the Yale Database of Faces.

* Journal of Informatics and Mathematical Sciences 10 (2018) 201-215
* 15 pages, 6 figures, 3 tables
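
For readers unfamiliar with the multiplicative updating mentioned in the abstract above, the sketch below shows the standard unpenalized updates for the squared-error cost (the Lee and Seung scheme). It is not the TNMF or ZNMF algorithm: the Toeplitz and Zellner penalty terms are not specified in the abstracts and are omitted. The helper names and the toy matrix are illustrative assumptions.

```python
import random


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def frob_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))


def nmf(v, rank, iters=200, eps=1e-9, seed=0):
    """Plain NMF: V (n x m) ~ W (n x rank) @ H (rank x m), all nonnegative."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        # H <- H * (W^T V) / (W^T W H), elementwise
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        # W <- W * (V H^T) / (W H H^T), elementwise
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Toy nonnegative matrix of rank 2 (made up for illustration).
v = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0], [1.0, 1.0, 1.0], [3.0, 5.0, 7.0]]
w, h = nmf(v, rank=2)
error = frob_error(v, w, h)
```

A penalized variant would typically add extra terms to the numerators or denominators of these updates, with the exact form depending on the penalty.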

What do Asian Religions Have in Common? An Unsupervised Text Analytics Exploration

Dec 20, 2019

Preeti Sah, Ernest Fokoué

Abstract: The main source of various religious teachings is their sacred texts, which vary from religion to religion based on factors like the geographical location or time of birth of a particular religion. Despite these differences, there could be similarities between the sacred texts based on the lessons they teach their followers. This paper attempts to find such similarity using text mining techniques. A corpus consisting of Asian (Tao Te Ching, Buddhism, Yogasutra, Upanishad) and non-Asian (four Bible texts) sacred texts is used to explore similarity measures like Euclidean, Manhattan, Jaccard, and Cosine on the raw Document Term Matrix [DTM] and the normalized DTM, which reveal similarity based on word usage. The performance of supervised learning algorithms like K-Nearest Neighbor [KNN], Support Vector Machine [SVM], and Random Forest is measured by their accuracy in predicting the correct sacred text for any given chapter in the corpus. The K-means clustering visualizations on Euclidean distances of the raw DTM reveal that there exists a pattern of similarity among these sacred texts, with the Upanishads and Tao Te Ching being the most similar texts in the corpus.

* 18 pages, 22 figures
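
The four measures named in the abstract above can be sketched on two rows of a document-term matrix as follows. The vocabulary and counts are made up for illustration and do not come from the paper's corpus. Jaccard is computed here on the sets of terms present in each document, which is one common convention and an assumption about the paper's usage.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def jaccard_similarity(a, b):
    """Overlap of the sets of terms with a nonzero count."""
    terms_a = {i for i, x in enumerate(a) if x > 0}
    terms_b = {i for i, y in enumerate(b) if y > 0}
    return len(terms_a & terms_b) / len(terms_a | terms_b)


# Hypothetical term counts over the vocabulary ["way", "self", "mind", "lord"].
doc_a = [3, 1, 0, 0]
doc_b = [1, 1, 2, 0]

d_euclid = euclidean(doc_a, doc_b)
d_manhattan = manhattan(doc_a, doc_b)
s_cosine = cosine_similarity(doc_a, doc_b)
s_jaccard = jaccard_similarity(doc_a, doc_b)
```

Euclidean and Manhattan are distances (smaller means more similar), while cosine and Jaccard are similarities (larger means more similar). Cosine ignores document length, which is one reason raw and normalized matrices can give different pictures.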

Multi-Stage Fault Warning for Large Electric Grids Using Anomaly Detection and Machine Learning

Mar 15, 2019

Sanjeev Raja, Ernest Fokoué

Abstract: In the monitoring of a complex electric grid, it is of paramount importance to provide operators with early warnings of anomalies detected on the network, along with a precise classification and diagnosis of the specific fault type. In this paper, we propose a novel multi-stage early warning system prototype for electric grid fault detection, classification, subgroup discovery, and visualization. In the first stage, a computationally efficient anomaly detection method based on quartiles detects the presence of a fault in real time. In the second stage, the fault is classified into one of nine pre-defined disaster scenarios. The time series data are first mapped to highly discriminative features by applying dimensionality reduction based on temporal autocorrelation. The features are then passed to one of three classification techniques: support vector machine, random forest, or artificial neural network. Finally, in the third stage, intra-class clustering based on dynamic time warping is used to characterize the fault with further granularity. Results on the Bonneville Power Administration electric grid data show that i) the proposed anomaly detector is both fast and accurate; ii) dimensionality reduction leads to dramatic improvement in classification accuracy and speed; iii) the random forest method offers the most accurate, consistent, and robust fault classification; and iv) time series within a given class naturally separate into five distinct clusters which correspond closely to the geographical distribution of electric grid buses.

* 13 pages, 14 figures
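
The abstract above does not spell out its quartile-based detection rule. As an illustration of the general idea, the sketch below uses the familiar Tukey fences, flagging points more than `k` interquartile ranges outside the first or third quartile. The function name, the multiplier `k=1.5`, and the toy readings are assumptions rather than details from the paper.

```python
from statistics import quantiles


def iqr_anomalies(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Toy readings (made up): a steady signal with one sharp dip at index 7.
readings = [60.0, 60.1, 59.9, 60.0, 60.2, 59.8, 60.1, 57.0, 60.0, 59.9]
flagged = iqr_anomalies(readings)
```

A rule like this needs only order statistics over a window, which is consistent with the abstract's emphasis on computational efficiency. A streaming version would recompute the quartiles over a sliding window.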
