Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Topic:lstm

Quantum Reinforcement Learning Trading Agent for Sector Rotation in the Taiwan Stock Market

Jun 26, 2025

Chi-Sheng Chen, Xinyu Zhang, Ya-Chuan Chen

Abstract:We propose a hybrid quantum-classical reinforcement learning framework for sector rotation in the Taiwan stock market. Our system employs Proximal Policy Optimization (PPO) as the backbone algorithm and integrates both classical architectures (LSTM, Transformer) and quantum-enhanced models (QNN, QRWKV, QASA) as policy and value networks. An automated feature engineering pipeline extracts financial indicators from capital share data to ensure consistent model input across all configurations. Empirical backtesting reveals a key finding: although quantum-enhanced models consistently achieve higher training rewards, they underperform classical models in real-world investment metrics such as cumulative return and Sharpe ratio. This discrepancy highlights a core challenge in applying reinforcement learning to financial domains -- namely, the mismatch between proxy reward signals and true investment objectives. Our analysis suggests that current reward designs may incentivize overfitting to short-term volatility rather than optimizing risk-adjusted returns. This issue is compounded by the inherent expressiveness and optimization instability of quantum circuits under Noisy Intermediate-Scale Quantum (NISQ) constraints. We discuss the implications of this reward-performance gap and propose directions for future improvement, including reward shaping, model regularization, and validation-based early stopping. Our work offers a reproducible benchmark and critical insights into the practical challenges of deploying quantum reinforcement learning in real-world finance.

Via

Access Paper or Ask Questions

FINN-GL: Generalized Mixed-Precision Extensions for FPGA-Accelerated LSTMs

Jun 25, 2025

Shashwat Khandelwal, Jakoba Petri-Koenig, Thomas B. Preußer, Michaela Blott, Shreejith Shanker

Abstract:Recurrent neural networks (RNNs), particularly LSTMs, are effective for time-series tasks like sentiment analysis and short-term stock prediction. However, their computational complexity poses challenges for real-time deployment in resource constrained environments. While FPGAs offer a promising platform for energy-efficient AI acceleration, existing tools mainly target feed-forward networks, and LSTM acceleration typically requires full custom implementation. In this paper, we address this gap by leveraging the open-source and extensible FINN framework to enable the generalized deployment of LSTMs on FPGAs. Specifically, we leverage the Scan operator from the Open Neural Network Exchange (ONNX) specification to model the recurrent nature of LSTM computations, enabling support for mixed quantisation within them and functional verification of LSTM-based models. Furthermore, we introduce custom transformations within the FINN compiler to map the quantised ONNX computation graph to hardware blocks from the HLS kernel library of the FINN compiler and Vitis HLS. We validate the proposed tool-flow by training a quantised ConvLSTM model for a mid-price stock prediction task using the widely used dataset and generating a corresponding hardware IP of the model using our flow, targeting the XCZU7EV device. We show that the generated quantised ConvLSTM accelerator through our flow achieves a balance between performance (latency) and resource consumption, while matching (or bettering) inference accuracy of state-of-the-art models with reduced precision. We believe that the generalisable nature of the proposed flow will pave the way for resource-efficient RNN accelerator designs on FPGAs.

* 9 pages, 6 figures, 5 tables, Accepted for publication in IEEE FPL-2025 (https://2025.fpl.org/)

Via

Access Paper or Ask Questions

Show, Tell and Summarize: Dense Video Captioning Using Visual Cue Aided Sentence Summarization

Jun 25, 2025

Zhiwang Zhang, Dong Xu, Wanli Ouyang, Chuanqi Tan

Abstract:In this work, we propose a division-and-summarization (DaS) framework for dense video captioning. After partitioning each untrimmed long video as multiple event proposals, where each event proposal consists of a set of short video segments, we extract visual feature (e.g., C3D feature) from each segment and use the existing image/video captioning approach to generate one sentence description for this segment. Considering that the generated sentences contain rich semantic descriptions about the whole event proposal, we formulate the dense video captioning task as a visual cue aided sentence summarization problem and propose a new two stage Long Short Term Memory (LSTM) approach equipped with a new hierarchical attention mechanism to summarize all generated sentences as one descriptive sentence with the aid of visual features. Specifically, the first-stage LSTM network takes all semantic words from the generated sentences and the visual features from all segments within one event proposal as the input, and acts as the encoder to effectively summarize both semantic and visual information related to this event proposal. The second-stage LSTM network takes the output from the first-stage LSTM network and the visual features from all video segments within one event proposal as the input, and acts as the decoder to generate one descriptive sentence for this event proposal. Our comprehensive experiments on the ActivityNet Captions dataset demonstrate the effectiveness of our newly proposed DaS framework for dense video captioning.

* 10 pages

Via

Access Paper or Ask Questions

Dense Video Captioning using Graph-based Sentence Summarization

Jun 25, 2025

Zhiwang Zhang, Dong Xu, Wanli Ouyang, Luping Zhou

Abstract:Recently, dense video captioning has made attractive progress in detecting and captioning all events in a long untrimmed video. Despite promising results were achieved, most existing methods do not sufficiently explore the scene evolution within an event temporal proposal for captioning, and therefore perform less satisfactorily when the scenes and objects change over a relatively long proposal. To address this problem, we propose a graph-based partition-and-summarization (GPaS) framework for dense video captioning within two stages. For the ``partition" stage, a whole event proposal is split into short video segments for captioning at a finer level. For the ``summarization" stage, the generated sentences carrying rich description information for each segment are summarized into one sentence to describe the whole event. We particularly focus on the ``summarization" stage, and propose a framework that effectively exploits the relationship between semantic words for summarization. We achieve this goal by treating semantic words as nodes in a graph and learning their interactions by coupling Graph Convolutional Network (GCN) and Long Short Term Memory (LSTM), with the aid of visual cues. Two schemes of GCN-LSTM Interaction (GLI) modules are proposed for seamless integration of GCN and LSTM. The effectiveness of our approach is demonstrated via an extensive comparison with the state-of-the-arts methods on the two benchmarks ActivityNet Captions dataset and YouCook II dataset.

* 12 pages

Via

Access Paper or Ask Questions

Efficacy of Temporal Fusion Transformers for Runoff Simulation

Jun 25, 2025

Sinan Rasiya Koya, Tirthankar Roy

Abstract:Combining attention with recurrence has shown to be valuable in sequence modeling, including hydrological predictions. Here, we explore the strength of Temporal Fusion Transformers (TFTs) over Long Short-Term Memory (LSTM) networks in rainfall-runoff modeling. We train ten randomly initialized models, TFT and LSTM, for 531 CAMELS catchments in the US. We repeat the experiment with five subsets of the Caravan dataset, each representing catchments in the US, Australia, Brazil, Great Britain, and Chile. Then, the performance of the models, their variability regarding the catchment attributes, and the difference according to the datasets are assessed. Our findings show that TFT slightly outperforms LSTM, especially in simulating the midsection and peak of hydrographs. Furthermore, we show the ability of TFT to handle longer sequences and why it can be a better candidate for higher or larger catchments. Being an explainable AI technique, TFT identifies the key dynamic and static variables, providing valuable scientific insights. However, both TFT and LSTM exhibit a considerable drop in performance with the Caravan dataset, indicating possible data quality issues. Overall, the study highlights the potential of TFT in improving hydrological modeling and understanding.

Via

Access Paper or Ask Questions

Emotion Detection on User Front-Facing App Interfaces for Enhanced Schedule Optimization: A Machine Learning Approach

Jun 24, 2025

Feiting Yang, Antoine Moevus, Steve Lévesque

Abstract:Human-Computer Interaction (HCI) has evolved significantly to incorporate emotion recognition capabilities, creating unprecedented opportunities for adaptive and personalized user experiences. This paper explores the integration of emotion detection into calendar applications, enabling user interfaces to dynamically respond to users' emotional states and stress levels, thereby enhancing both productivity and engagement. We present and evaluate two complementary approaches to emotion detection: a biometric-based method utilizing heart rate (HR) data extracted from electrocardiogram (ECG) signals processed through Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) neural networks to predict the emotional dimensions of Valence, Arousal, and Dominance; and a behavioral method analyzing computer activity through multiple machine learning models to classify emotions based on fine-grained user interactions such as mouse movements, clicks, and keystroke patterns. Our comparative analysis, from real-world datasets, reveals that while both approaches demonstrate effectiveness, the computer activity-based method delivers superior consistency and accuracy, particularly for mouse-related interactions, which achieved approximately 90\% accuracy. Furthermore, GRU networks outperformed LSTM models in the biometric approach, with Valence prediction reaching 84.38\% accuracy.

Via

Access Paper or Ask Questions

Fiber Signal Denoising Algorithm using Hybrid Deep Learning Networks

Jun 18, 2025

Linlin Wang, Wei Wang, Dezhao Wang, Shanwen Wang

Abstract:With the applicability of optical fiber-based distributed acoustic sensing (DAS) systems, effective signal processing and analysis approaches are needed to promote its popularization in the field of intelligent transportation systems (ITS). This paper presents a signal denoising algorithm using a hybrid deep-learning network (HDLNet). Without annotated data and time-consuming labeling, this self-supervised network runs in parallel, combining an autoencoder for denoising (DAE) and a long short-term memory (LSTM) for sequential processing. Additionally, a line-by-line matching algorithm for vehicle detection and tracking is introduced, thus realizing the complete processing of fiber signal denoising and feature extraction. Experiments were carried out on a self-established real highway tunnel dataset, showing that our proposed hybrid network yields more satisfactory denoising performance than Spatial-domain DAE.

* 15 pages, 10 figures

Via

Access Paper or Ask Questions

HiLight: A Hierarchical Reinforcement Learning Framework with Global Adversarial Guidance for Large-Scale Traffic Signal Control

Jun 17, 2025

Yaqiao Zhu, Hongkai Wen, Geyong Min, Man Luo

Abstract:Efficient traffic signal control (TSC) is essential for mitigating urban congestion, yet existing reinforcement learning (RL) methods face challenges in scaling to large networks while maintaining global coordination. Centralized RL suffers from scalability issues, while decentralized approaches often lack unified objectives, resulting in limited network-level efficiency. In this paper, we propose HiLight, a hierarchical reinforcement learning framework with global adversarial guidance for large-scale TSC. HiLight consists of a high-level Meta-Policy, which partitions the traffic network into subregions and generates sub-goals using a Transformer-LSTM architecture, and a low-level Sub-Policy, which controls individual intersections with global awareness. To improve the alignment between global planning and local execution, we introduce an adversarial training mechanism, where the Meta-Policy generates challenging yet informative sub-goals, and the Sub-Policy learns to surpass these targets, leading to more effective coordination. We evaluate HiLight across both synthetic and real-world benchmarks, and additionally construct a large-scale Manhattan network with diverse traffic conditions, including peak transitions, adverse weather, and holiday surges. Experimental results show that HiLight exhibits significant advantages in large-scale scenarios and remains competitive across standard benchmarks of varying sizes.

Via

Access Paper or Ask Questions

Between Predictability and Randomness: Seeking Artistic Inspiration from AI Generative Models

Jun 14, 2025

Olga Vechtomova

Abstract:Artistic inspiration often emerges from language that is open to interpretation. This paper explores the use of AI-generated poetic lines as stimuli for creativity. Through analysis of two generative AI approaches--lines generated by Long Short-Term Memory Variational Autoencoders (LSTM-VAE) and complete poems by Large Language Models (LLMs)--I demonstrate that LSTM-VAE lines achieve their evocative impact through a combination of resonant imagery and productive indeterminacy. While LLMs produce technically accomplished poetry with conventional patterns, LSTM-VAE lines can engage the artist through semantic openness, unconventional combinations, and fragments that resist closure. Through the composition of an original poem, where narrative emerged organically through engagement with LSTM-VAE generated lines rather than following a predetermined structure, I demonstrate how these characteristics can serve as evocative starting points for authentic artistic expression.

* Presented as a keynote at the 50th Linguistic Association of Canada and the United States (LACUS) conference in July 2024 and will be published in LACUS Forum 50

Via

Access Paper or Ask Questions

Optimizing Blood Transfusions and Predicting Shortages in Resource-Constrained Areas

Jun 14, 2025

El Arbi Belfarsi, Sophie Brubaker, Maria Valero

Abstract:Our research addresses the critical challenge of managing blood transfusions and optimizing allocation in resource-constrained regions. We present heuristic matching algorithms for donor-patient and blood bank selection, alongside machine learning methods to analyze blood transfusion acceptance data and predict potential shortages. We developed simulations to optimize blood bank operations, progressing from random allocation to a system incorporating proximity-based selection, blood type compatibility, expiration prioritization, and rarity scores. Moving from blind matching to a heuristic-based approach yielded a 28.6% marginal improvement in blood request acceptance, while a multi-level heuristic matching resulted in a 47.6% improvement. For shortage prediction, we compared Long Short-Term Memory (LSTM) networks, Linear Regression, and AutoRegressive Integrated Moving Average (ARIMA) models, trained on 170 days of historical data. Linear Regression slightly outperformed others with a 1.40% average absolute percentage difference in predictions. Our solution leverages a Cassandra NoSQL database, integrating heuristic optimization and shortage prediction to proactively manage blood resources. This scalable approach, designed for resource-constrained environments, considers factors such as proximity, blood type compatibility, inventory expiration, and rarity. Future developments will incorporate real-world data and additional variables to improve prediction accuracy and optimization performance.

* Proceedings of the 18th International Joint Conference on Biomedical Engineering Systems and Technologies 2025
* 12 pages, 9 figures, International Conference on Health Informatics

Via

Access Paper or Ask Questions

Topic:lstm

Papers and Code