Hugo Inzirillo

SigGate: Enhancing Recurrent Neural Networks with Signature-Based Gating Mechanisms

Feb 13, 2025

Rémi Genet, Hugo Inzirillo

Abstract:In this paper, we propose a novel approach that enhances recurrent neural networks (RNNs) by incorporating path signatures into their gating mechanisms. Our method modifies both Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) architectures by replacing their forget and reset gates, respectively, with learnable path signatures. These signatures, which capture the geometric features of the entire path history, provide a richer context for controlling information flow through the network's memory. This modification allows the networks to make memory decisions based on the full historical context rather than just the current input and state. Through experimental studies, we demonstrate that our Signature-LSTM (SigLSTM) and Signature-GRU (SigGRU) models outperform their traditional counterparts across various sequential learning tasks. By leveraging path signatures in recurrent architectures, this method offers new opportunities to enhance performance in time series analysis and forecasting applications.

STAN: Smooth Transition Autoregressive Networks

Jan 30, 2025

Hugo Inzirillo, Remi Genet

Abstract:Traditional Smooth Transition Autoregressive (STAR) models offer an effective way to model regime-dependent dynamics through smooth regime changes driven by specific transition variables. In this paper, we propose a novel approach by drawing an analogy between STAR models and a multilayer neural network architecture. Our proposed neural network architecture mimics the STAR framework, employing multiple layers to simulate the smooth transition between regimes and to capture complex, nonlinear relationships. The network's hidden layers and activation functions are structured to replicate the gradual switching behavior typical of STAR models, allowing for a more flexible and scalable approach to regime-dependent modeling. This research suggests that neural networks can provide a powerful alternative to STAR models, with the potential to enhance predictive accuracy in economic and financial forecasting.
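
As background for the analogy drawn in this abstract, here is a minimal sketch of a classical two-regime logistic STAR (LSTAR) prediction, in which a logistic function of the transition variable blends two autoregressive regimes. It is not the STAN network itself; the function names `logistic_transition` and `lstar_predict` and all parameter values are hypothetical, chosen only for illustration.

```python
import numpy as np

def logistic_transition(s, gamma, c):
    """Logistic transition weight in (0, 1).

    gamma controls how sharp the switch is, c is the threshold location.
    """
    return 1.0 / (1.0 + np.exp(-gamma * (s - c)))

def lstar_predict(y_lags, s, phi_low, phi_high, gamma, c):
    """One-step prediction of a two-regime LSTAR model.

    y_lags   : lagged observations, shape (p,)
    s        : value of the transition variable
    phi_low  : intercept and AR coefficients of the low regime, shape (p + 1,)
    phi_high : intercept and AR coefficients of the high regime, shape (p + 1,)
    """
    x = np.concatenate(([1.0], y_lags))      # prepend 1 for the intercept
    g = logistic_transition(s, gamma, c)     # weight of the high regime
    return (1.0 - g) * (phi_low @ x) + g * (phi_high @ x)

# Illustrative values only (assumptions, not taken from the paper).
lags = np.array([0.2, -0.1])
phi_low = np.array([0.0, 0.5, 0.1])
phi_high = np.array([1.0, -0.3, 0.2])
prediction = lstar_predict(lags, s=0.4, phi_low=phi_low, phi_high=phi_high,
                           gamma=5.0, c=0.3)
```

The sigmoid gate here plays the same role as a sigmoid activation feeding a weighted combination of two linear units, which is the kind of correspondence the abstract describes.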

Keras Sig: Efficient Path Signature Computation on GPU in Keras 3

Jan 14, 2025

Rémi Genet, Hugo Inzirillo

Abstract:In this paper we introduce Keras Sig, a high-performance Pythonic library designed to compute path signatures for deep learning applications. Built entirely in Keras 3, Keras Sig integrates seamlessly with the most widely used deep learning backends, such as PyTorch, JAX and TensorFlow. Inspired by Kidger and Lyons (2021), we propose a novel approach that reshapes signature calculations to leverage GPU parallelism. This adjustment reduces training time by 55% and yields 5- to 10-fold improvements in direct signature computation compared to existing methods, while maintaining similar CPU performance. Relying on high-level tensor operations instead of low-level C++ code, Keras Sig significantly reduces the versioning and compatibility issues commonly encountered in deep learning libraries, while delivering superior or comparable performance across various hardware configurations. We demonstrate through extensive benchmarking that our approach scales efficiently with the length of input sequences and maintains competitive performance across various signature parameters, though it is bounded by memory constraints for very large signature dimensions.
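
For readers unfamiliar with the quantity being computed, the following is a minimal NumPy sketch of a path signature truncated at depth 2 for a piecewise-linear path. It does not use the Keras Sig API and is not the authors' GPU algorithm; the function name `signature_depth2` is hypothetical, and the code relies only on the standard closed form for piecewise-linear paths.

```python
import numpy as np

def signature_depth2(path):
    """Depth-2 signature of a piecewise-linear path.

    path : array of shape (T, d), the T sample points of a d-dimensional path.
    Returns (level1, level2) with shapes (d,) and (d, d).
    """
    path = np.asarray(path, dtype=float)
    increments = np.diff(path, axis=0)        # segment increments, (T-1, d)

    # Level 1: total displacement of the path.
    level1 = increments.sum(axis=0)

    # Level 2: iterated integrals. For each segment, combine the displacement
    # accumulated before the segment with the segment's own increment, then
    # add the within-segment term 0.5 * increment (outer) increment.
    offsets = path[:-1] - path[0]             # displacement at segment start
    level2 = offsets.T @ increments + 0.5 * (increments.T @ increments)
    return level1, level2

# Example: move one unit along the first axis, then one unit along the second.
level1, level2 = signature_depth2([[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]])
levy_area = 0.5 * (level2[0, 1] - level2[1, 0])   # signed area, here 0.5
```

The antisymmetric part of the second level is the Lévy area, which is one of the geometric features that makes signatures informative about the order in which a path moves.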

CaAdam: Improving Adam optimizer using connection aware methods

Oct 31, 2024

Remi Genet, Hugo Inzirillo

Abstract:We introduce a new method inspired by Adam that enhances convergence speed and achieves better loss function minima. Traditional optimizers, including Adam, apply uniform or globally adjusted learning rates across neural networks without considering their architectural specifics. This architecture-agnostic approach is deeply embedded in most deep learning frameworks, where optimizers are implemented as standalone modules without direct access to the network's structural information. For instance, in popular frameworks like Keras or PyTorch, optimizers operate solely on gradients and parameters, without knowledge of layer connectivity or network topology. Our algorithm, CaAdam, explores this overlooked area by introducing connection-aware optimization through carefully designed proxies of architectural information. We propose multiple scaling methodologies that dynamically adjust learning rates based on easily accessible structural properties such as layer depth, connection counts, and gradient distributions. This approach enables more granular optimization while working within the constraints of current deep learning frameworks. Empirical evaluations on standard datasets (e.g., CIFAR-10, Fashion MNIST) show that our method consistently achieves faster convergence and higher accuracy than the standard Adam optimizer, demonstrating the potential benefits of incorporating architectural awareness in optimization strategies.
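
To make the idea of a structure-dependent learning rate concrete, here is a rough NumPy sketch of a standard Adam update with an extra per-layer multiplier. The Adam step follows the usual bias-corrected formulation; the `depth_scale` rule is purely hypothetical and is not one of the scaling methodologies from the paper, whose exact formulas are not given in this abstract.

```python
import numpy as np

def adam_step(param, grad, state, lr=1e-3, scale=1.0,
              beta1=0.9, beta2=0.999, eps=1e-8):
    """Standard bias-corrected Adam update, with the step multiplied by `scale`.

    state : dict with keys "t" (step count), "m" and "v" (moment estimates).
    """
    state["t"] += 1
    state["m"] = beta1 * state["m"] + (1.0 - beta1) * grad
    state["v"] = beta2 * state["v"] + (1.0 - beta2) * grad ** 2
    m_hat = state["m"] / (1.0 - beta1 ** state["t"])
    v_hat = state["v"] / (1.0 - beta2 ** state["t"])
    return param - lr * scale * m_hat / (np.sqrt(v_hat) + eps)

def depth_scale(layer_index, num_layers, strength=0.5):
    """Hypothetical depth proxy: larger steps for early layers, smaller for late ones.

    Goes linearly from 1 + strength (first layer) to 1 - strength (last layer).
    """
    if num_layers <= 1:
        return 1.0
    position = layer_index / (num_layers - 1)   # 0 for first layer, 1 for last
    return 1.0 + strength * (1.0 - 2.0 * position)

# One update for the first layer of a 3-layer network (illustrative values).
state = {"t": 0, "m": np.zeros(2), "v": np.zeros(2)}
updated = adam_step(np.zeros(2), np.array([2.0, -3.0]), state,
                    lr=0.1, scale=depth_scale(0, 3))
```

Because the multiplier only needs a layer index, a sketch like this stays within the constraint the abstract mentions: the optimizer never sees the network topology directly, only a cheap proxy for it.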

A Temporal Linear Network for Time Series Forecasting

Oct 28, 2024

Remi Genet, Hugo Inzirillo

Abstract:Recent research has challenged the necessity of complex deep learning architectures for time series forecasting, demonstrating that simple linear models can often outperform sophisticated approaches. Building upon this insight, we introduce a novel architecture, the Temporal Linear Net (TLN), that extends the capabilities of linear models while maintaining interpretability and computational efficiency. TLN is designed to effectively capture both temporal and feature-wise dependencies in multivariate time series data. Our approach is a variant of TSMixer that maintains strict linearity throughout its architecture. TLN removes activation functions, introduces specialized kernel initializations, and incorporates dilated convolutions to handle various time scales, while preserving the linear nature of the model. Unlike transformer-based models that may lose temporal information due to their permutation-invariant nature, TLN explicitly preserves and leverages the temporal structure of the input data. A key innovation of TLN is its ability to compute an equivalent linear model, offering a level of interpretability not found in more complex architectures such as TSMixer. This feature allows for seamless conversion between the full TLN model and its linear equivalent, facilitating both training flexibility and inference optimization.
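
The claim that a strictly linear network has an equivalent single linear model can be illustrated with a small NumPy sketch. It uses a simplified stand-in (one time-mixing matrix, one feature-mixing matrix, and a bias) rather than the actual TLN layers, and the function names are hypothetical.

```python
import numpy as np

def two_stage_linear(x, time_mix, feature_mix, bias):
    """Mix along time, then along features, with no activation function.

    x           : input window, shape (t_in, f_in)
    time_mix    : shape (t_out, t_in)
    feature_mix : shape (f_in, f_out)
    bias        : shape (t_out, f_out)
    """
    return time_mix @ x @ feature_mix + bias

def collapse_to_single_linear(time_mix, feature_mix, bias):
    """Return (weight, flat_bias) acting on the row-major flattened input."""
    # For row-major flattening, flatten(A X B) = kron(A, B.T) @ flatten(X).
    weight = np.kron(time_mix, feature_mix.T)
    return weight, bias.reshape(-1)

rng = np.random.default_rng(0)
t_in, t_out, f_in, f_out = 6, 3, 4, 2
time_mix = rng.normal(size=(t_out, t_in))
feature_mix = rng.normal(size=(f_in, f_out))
bias = rng.normal(size=(t_out, f_out))
x = rng.normal(size=(t_in, f_in))

weight, flat_bias = collapse_to_single_linear(time_mix, feature_mix, bias)
y_full = two_stage_linear(x, time_mix, feature_mix, bias)
y_flat = weight @ x.reshape(-1) + flat_bias   # same values as y_full, flattened
```

Each row of `weight` shows how every input time step and feature contributes to one output value, which is the kind of direct interpretability the abstract attributes to the equivalent linear model.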

Deep State Space Recurrent Neural Networks for Time Series Forecasting

Jul 21, 2024

Hugo Inzirillo

Abstract:We explore various neural network architectures for modeling the dynamics of the cryptocurrency market. Traditional linear models often fall short in accurately capturing the unique and complex dynamics of this market. In contrast, Deep Neural Networks (DNNs) have demonstrated considerable proficiency in time series forecasting. This paper introduces a novel neural network framework that blends the principles of econometric state space models with the dynamic capabilities of Recurrent Neural Networks (RNNs). We propose state space models using Long Short-Term Memory (LSTM), Gated Recurrent Units (GRU) and Temporal Kolmogorov-Arnold Networks (TKANs). According to the results, TKANs, inspired by Kolmogorov-Arnold Networks (KANs) and LSTM, demonstrate promising outcomes.

SigKAN: Signature-Weighted Kolmogorov-Arnold Networks for Time Series

Jun 25, 2024

Hugo Inzirillo, Remi Genet

Abstract:We propose a novel approach that enhances multivariate function approximation using learnable path signatures and Kolmogorov-Arnold networks (KANs). We enhance the learning capabilities of these networks by weighting the values obtained by KANs using learnable path signatures, which capture important geometric features of paths. This combination allows for a more comprehensive and flexible representation of sequential and temporal data. We demonstrate through studies that our SigKANs with learnable path signatures perform better than conventional methods across a range of function approximation challenges. By leveraging path signatures in neural networks, this method offers intriguing opportunities to enhance performance in time series analysis and time series forecasting, among other fields.

* arXiv admin note: text overlap with arXiv:2405.07344, arXiv:2406.02486

A Temporal Kolmogorov-Arnold Transformer for Time Series Forecasting

Jun 04, 2024

Remi Genet, Hugo Inzirillo

Abstract:Capturing complex temporal patterns and relationships within multivariate data streams is a difficult task. We propose the Temporal Kolmogorov-Arnold Transformer (TKAT), a novel attention-based architecture designed to address this task using Temporal Kolmogorov-Arnold Networks (TKANs). Inspired by the Temporal Fusion Transformer (TFT), TKAT emerges as a powerful encoder-decoder model tailored to handle tasks in which the observed part of the features is more important than the a priori known part. This new architecture combines the theoretical foundation of the Kolmogorov-Arnold representation with the power of transformers. TKAT aims to simplify the complex dependencies inherent in time series, making them more "interpretable". The use of a transformer architecture in this framework allows us to capture long-range dependencies through self-attention mechanisms.

TKAN: Temporal Kolmogorov-Arnold Networks

May 12, 2024

Remi Genet, Hugo Inzirillo

Abstract:Recurrent Neural Networks (RNNs) have revolutionized many areas of machine learning, particularly in natural language and data sequence processing. Long Short-Term Memory (LSTM) has demonstrated its ability to capture long-term dependencies in sequential data. Inspired by Kolmogorov-Arnold Networks (KANs), a promising alternative to Multi-Layer Perceptrons (MLPs), we propose a new neural network architecture that draws on both KANs and the LSTM: the Temporal Kolmogorov-Arnold Networks (TKANs). TKANs combine the strengths of both networks and are composed of Recurring Kolmogorov-Arnold Network (RKAN) layers that embed memory management. This innovation enables us to perform multi-step time series forecasting with enhanced accuracy and efficiency. By addressing the limitations of traditional models in handling complex sequential patterns, the TKAN architecture offers significant potential for advancements in fields requiring more than one step ahead forecasting.

An Attention Free Conditional Autoencoder For Anomaly Detection in Cryptocurrencies

Apr 20, 2023

Hugo Inzirillo, Ludovic De Villelongue

Abstract:It is difficult to identify anomalies in time series, especially when there is a lot of noise. Denoising techniques can remove the noise, but they can also cause a significant loss of information. To detect anomalies in time series, we propose an attention free conditional autoencoder (AF-CA). We start from the conditional autoencoder model, to which we add an Attention-Free LSTM layer \cite{inzirillo2022attention} in order to make anomaly detection more reliable and more powerful. We compare the results of our Attention Free Conditional Autoencoder with those of an LSTM Autoencoder and clearly improve the explanatory power of the model and therefore the detection of anomalies in noisy time series.
