Jiexia Ye

Sparseformer: a Transferable Transformer with Multi-granularity Token Sparsification for Medical Time Series Classification

Mar 19, 2025

Jiexia Ye, Weiqi Zhang, Ziyue Li, Jia Li, Fugee Tsung

Abstract:Medical time series (MedTS) classification is crucial for improved diagnosis in healthcare, and yet it is challenging due to the varying granularity of patterns, intricate inter-channel correlation, information redundancy, and label scarcity. While existing transformer-based models have shown promise in time series analysis, they mainly focus on forecasting and fail to fully exploit the distinctive characteristics of MedTS data. In this paper, we introduce Sparseformer, a transformer specifically designed for MedTS classification. We propose a sparse token-based dual-attention mechanism that enables global modeling and token compression, allowing dynamic focus on the most informative tokens while distilling redundant features. This mechanism is then applied to the multi-granularity, cross-channel encoding of medical signals, capturing intra- and inter-granularity correlations and inter-channel connections. The sparsification design allows our model to handle heterogeneous inputs of varying lengths and channels directly. Further, we introduce an adaptive label encoder to address label space misalignment across datasets, equipping our model with cross-dataset transferability to alleviate the medical label scarcity issue. Our model outperforms 12 baselines across seven medical datasets under supervised learning. In the few-shot learning experiments, our model also achieves superior average results. In addition, the in-domain and cross-domain experiments among three diagnostic scenarios demonstrate our model's zero-shot learning capability. Collectively, these findings underscore the robustness and transferability of our model in various medical applications.

* 3 figures, 16 pages, 5 tables
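
The token-compression idea in the abstract above can be illustrated with a minimal sketch. It is not the Sparseformer mechanism itself: it assumes a single query vector, scaled dot-product scores, and a hard top-k selection, and the names `sparsify_tokens`, `tokens`, and `query` are hypothetical. What it shows is that keeping a fixed number `k` of tokens yields the same output size for any input length.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sparsify_tokens(tokens, query, k):
    """Keep the k tokens with the highest attention weight for one query.

    Assumption: scaled dot-product scoring and hard top-k selection,
    chosen for illustration only.
    """
    dim = len(query)
    scores = [
        sum(t[i] * query[i] for i in range(dim)) / math.sqrt(dim)
        for t in tokens
    ]
    weights = softmax(scores)
    # Rank token indices by attention weight, highest first.
    top = sorted(range(len(tokens)), key=lambda i: weights[i], reverse=True)[:k]
    top.sort()  # restore the original temporal order of the kept tokens
    return top, [tokens[i] for i in top]


tokens = [[0.1, 0.0], [2.0, 1.0], [0.0, 0.2], [1.5, 1.5]]
query = [1.0, 1.0]
kept_idx, kept = sparsify_tokens(tokens, query, k=2)
print(kept_idx, kept)
```

A learned sparsification would replace the fixed query with trainable parameters; the hard top-k here is only the simplest way to make the selection step visible.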

DualTime: A Dual-Adapter Multimodal Language Model for Time Series Representation

Jun 07, 2024

Weiqi Zhang, Jiexia Ye, Ziyue Li, Jia Li, Fugee Tsung

Abstract:The recent rapid development of language models (LMs) has attracted attention in the field of time series, including multimodal time series modeling. However, we note that current time series multimodal methods are biased, often assigning a primary role to one modality while the other assumes a secondary role. They overlook the mutual benefits and complementarity of different modalities. For example, in seizure diagnosis, relying solely on textual clinical reports makes it difficult to pinpoint the area and type of the disease, while electroencephalograms (EEGs) alone cannot provide an accurate diagnosis without considering the symptoms. In this study, based on the complementary information mining of time series multimodal data, we propose DualTime, a Dual-adapter multimodal language model for Time series representation that implements temporal-primary and textual-primary modeling simultaneously. By injecting lightweight adaption tokens, the LM pipeline shared by the dual adapters encourages embedding alignment and achieves efficient fine-tuning. Empirically, our method outperforms state-of-the-art models in both supervised and unsupervised settings, highlighting the complementary benefits of different modalities. In addition, we conduct few-shot label transfer experiments, which further verify the transferability and expressiveness of our proposed DualTime.

* 15 pages, 12 figures, 5 tables

A Survey of Time Series Foundation Models: Generalizing Time Series Representation with Large Language Model

May 07, 2024

Jiexia Ye, Weiqi Zhang, Ke Yi, Yongzi Yu, Ziyue Li, Jia Li, Fugee Tsung

Abstract:Time series data are ubiquitous across various domains, making time series analysis critically important. Traditional time series models are task-specific, featuring singular functionality and limited generalization capacity. Recently, large language foundation models have unveiled their remarkable capabilities for cross-task transferability, zero-shot/few-shot learning, and decision-making explainability. This success has sparked interest in the exploration of foundation models to solve multiple time series challenges simultaneously. There are two main research lines, namely pre-training foundation models from scratch for time series and adapting large language foundation models for time series. They both contribute to the development of a unified model that is highly generalizable, versatile, and comprehensible for time series analysis. This survey offers a 3E analytical framework for comprehensive examination of related research. Specifically, we examine existing works from three dimensions, namely Effectiveness, Efficiency and Explainability. In each dimension, we focus on discussing how related works devise tailored solutions by considering unique challenges in the realm of time series. Furthermore, we provide a domain taxonomy to help followers keep up with the domain-specific advancements. In addition, we introduce extensive resources to facilitate the field's development, including datasets and open-source time series libraries. A GitHub repository is also maintained for resource updates (https://github.com/start2020/Awesome-TimeSeries-LLM-FM).

* 5 figures, 6 tables, 41 pages

Multi-View TRGRU: Transformer based Spatiotemporal Model for Short-Term Metro Origin-Destination Matrix Prediction

Aug 16, 2021

Jiexia Ye, Furong Zheng, Juanjuan Zhao, Kejiang Ye, Chengzhong Xu

Abstract:Accurate prediction of the short-term OD Matrix (i.e., the distribution of passenger flows from various origins to destinations) is a crucial task in metro systems. It is highly challenging due to the constantly changing nature of many impacting factors and the real-time delayed data collection problem. Recently, some deep learning-based models have been proposed for OD Matrix forecasting in ride-hailing and highway traffic scenarios. However, these models cannot sufficiently capture the complex spatiotemporal correlation between stations in metro networks due to their different prior knowledge and contextual settings. In this paper, we propose a hybrid framework, Multi-view TRGRU, to address metro OD matrix prediction. In particular, it uses three modules to model three flow change patterns: recent trend, daily trend, and weekly trend. In each module, a multi-view representation based on embedding for each station is constructed and fed into a transformer-based gated recurrent structure so as to capture the dynamic spatial dependency in OD flows of different stations by a global self-attention mechanism. Extensive experiments on three large-scale, real-world metro datasets demonstrate the superiority of our Multi-view TRGRU over other competitors.

* 10 pages, 8 figures
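
The three flow-change patterns named in the abstract above (recent, daily, weekly) are commonly fed to a model as three input windows taken at different strides. The sketch below is one simple way to build them and is not taken from the paper: `build_views`, `steps_per_day`, and `window` are hypothetical names, and each observation is a plain number rather than a full OD matrix.

```python
def build_views(series, t, steps_per_day, window):
    """Return (recent, daily, weekly) input windows for predicting index t.

    Assumption: the series is sampled at a fixed rate with `steps_per_day`
    observations per day, and each view holds `window` past observations.
    """
    steps_per_week = 7 * steps_per_day
    if t < window * steps_per_week or t > len(series):
        raise ValueError("not enough history to build the weekly view")
    # Immediately preceding steps.
    recent = [series[t - i] for i in range(window, 0, -1)]
    # Same time of day on the previous `window` days.
    daily = [series[t - i * steps_per_day] for i in range(window, 0, -1)]
    # Same time of day and weekday in the previous `window` weeks.
    weekly = [series[t - i * steps_per_week] for i in range(window, 0, -1)]
    return recent, daily, weekly


series = list(range(100))  # toy data: the value equals its time index
recent, daily, weekly = build_views(series, t=60, steps_per_day=4, window=2)
print(recent, daily, weekly)
```

Because the toy series stores its own index, the printed windows show directly which past time steps each view reads.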

Incorporating Reachability Knowledge into a Multi-Spatial Graph Convolution Based Seq2Seq Model for Traffic Forecasting

Jul 04, 2021

Jiexia Ye, Furong Zheng, Juanjuan Zhao, Kejiang Ye, Chengzhong Xu

Abstract:Accurate traffic state prediction is the foundation of transportation control and guidance. It is very challenging due to the complex spatiotemporal dependencies in traffic data. Existing works cannot perform well for multi-step traffic prediction that involves a long future time period. The spatiotemporal information dilution becomes severe when the time gap between the input step and the predicted step is large, especially when traffic data is insufficient or noisy. To address this issue, we propose a multi-spatial graph convolution based Seq2Seq model. Our main novelties lie in three aspects: (1) We enrich the spatiotemporal information of model inputs by fusing multi-view features (time, location and traffic states). (2) We build multiple kinds of spatial correlations based on both prior knowledge and data-driven knowledge to improve model performance, especially in insufficient or noisy data cases. (3) We design a novel spatiotemporal attention mechanism based on reachability knowledge to produce high-level features that are fed directly into the decoder of the Seq2Seq model to ease information dilution. Our model is evaluated on two real-world traffic datasets and achieves better performance than other competitors.

* 12 pages, 9 figures

How to Build a Graph-Based Deep Learning Architecture in Traffic Domain: A Survey

Jun 07, 2020

Jiexia Ye, Juanjuan Zhao, Kejiang Ye, Chengzhong Xu

Abstract:In recent years, various deep learning architectures have been proposed to solve complex challenges (e.g., spatial dependency, temporal dependency) in the traffic domain, and they have achieved satisfactory performance. These architectures are composed of multiple deep learning techniques in order to tackle various challenges in traffic data. Traditionally, convolutional neural networks (CNNs) are utilized to model spatial dependency by decomposing the traffic network into grids. However, many traffic networks are graph-structured in nature. In order to utilize such spatial information fully, it is more appropriate to formulate traffic networks as graphs mathematically. Recently, various novel deep learning techniques, called graph neural networks (GNNs), have been developed to process graph data. More and more works combine GNNs with other deep learning techniques to construct an architecture dealing with various challenges in a complex traffic task, where GNNs are responsible for extracting spatial correlations in the traffic network. These graph-based architectures have achieved state-of-the-art performance. To provide a comprehensive and clear picture of this emerging trend, this survey carefully examines various graph-based deep learning architectures in many traffic applications. We first give guidelines to formulate a traffic problem based on graphs and to construct graphs from various traffic data. Then we decompose these graph-based architectures to discuss their shared deep learning techniques, clarifying the utilization of each technique in traffic tasks. Moreover, we summarize common traffic challenges and the corresponding graph-based deep learning solutions to each challenge. Finally, we provide benchmark datasets, open-source code and future research directions in this rapidly growing field.

* 21 pages, 11 figures
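
To make "constructing graphs from traffic data" concrete, the sketch below builds a weighted adjacency matrix from pairwise sensor distances with a thresholded Gaussian kernel, a construction widely used in graph-based traffic forecasting. Whether the survey recommends this exact form is not stated in the abstract; `gaussian_adjacency`, `sigma`, and `threshold` are hypothetical names and the distances are toy values.

```python
import math


def gaussian_adjacency(dist, sigma, threshold):
    """Weighted adjacency from a pairwise distance matrix.

    w_ij = exp(-d_ij^2 / sigma^2) if that value is >= threshold, else 0.
    The diagonal is left at zero (no self-loops).
    """
    n = len(dist)
    adj = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            w = math.exp(-(dist[i][j] ** 2) / (sigma ** 2))
            if w >= threshold:
                adj[i][j] = w  # keep only sufficiently close sensor pairs
    return adj


# Toy distances between three sensors (arbitrary units).
dist = [
    [0.0, 1.0, 10.0],
    [1.0, 0.0, 2.0],
    [10.0, 2.0, 0.0],
]
adj = gaussian_adjacency(dist, sigma=2.0, threshold=0.1)
for row in adj:
    print([round(w, 3) for w in row])
```

The threshold controls sparsity: sensors 0 and 2 are far apart, so their edge weight falls below it and is dropped.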

Multi-View Graph Convolutional Networks for Relationship-Driven Stock Prediction

May 11, 2020

Jiexia Ye, Juanjuan Zhao, Kejiang Ye, Chengzhong Xu

Abstract:Stock price movement prediction is commonly accepted as a very challenging task due to the extremely volatile nature of financial markets. Previous works typically focus on understanding the temporal dependency of stock price movement based on the history of individual stock movement, but they do not take the complex relationships among involved stocks into consideration. However, it is well known that an individual stock price is correlated with the prices of other stocks. To address this, we propose a deep learning-based framework that utilizes a recurrent neural network (RNN) and a graph convolutional network (GCN) to predict stock movement. Specifically, we first use the RNN to model the temporal dependency of each related stock's price movement based on its own information from past time slices; then we employ the GCN to model the influence of the involved stocks based on three novel graphs, which represent the shareholder relationship, industry relationship and concept relationship among stocks based on investment decisions. Experiments on two stock indexes in the Chinese market show that our model outperforms other baselines. To the best of our knowledge, this is the first work to incorporate multiple relationships among involved stocks into a GCN-based deep learning framework for predicting stock price movement.
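
The graph-convolution step mentioned in the abstract above can be sketched with the standard GCN propagation rule, H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W). This is a generic single-layer illustration over one relationship graph, not the paper's implementation: how the three graphs are combined is not described here, and `gcn_layer`, `adj`, `features`, and `weight` are hypothetical names with toy values. In the framework described, the node features would be the per-stock RNN outputs.

```python
import math


def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} with self-loops."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj, features, weight):
    """One graph-convolution layer followed by ReLU."""
    propagated = matmul(normalize_adjacency(adj), features)
    out = matmul(propagated, weight)
    return [[max(0.0, v) for v in row] for row in out]


# Toy graph: stocks 0 and 1 are related, stock 2 is isolated.
adj = [[0.0, 1.0, 0.0],
       [1.0, 0.0, 0.0],
       [0.0, 0.0, 0.0]]
features = [[1.0], [3.0], [5.0]]  # stand-in for per-stock RNN outputs
weight = [[1.0]]                  # identity weight to expose the averaging
hidden = gcn_layer(adj, features, weight)
print(hidden)
```

With the identity weight, the two connected stocks end up with the average of their features while the isolated stock keeps its own, which is the neighbor-smoothing effect a GCN layer contributes.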
