Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Yan Lin

Fellow, IEEE

Log Optimization Simplification Method for Predicting Remaining Time

Mar 10, 2025

Jianhong Ye, Siyuan Zhang, Yan Lin

Abstract:Information systems generate a large volume of event log data during business operations, much of which consists of low-value and redundant information. When performance predictions are made directly from these logs, the accuracy of the predictions can be compromised. Researchers have explored methods to simplify and compress these data while preserving their valuable components. Most existing approaches focus on reducing the dimensionality of the data by eliminating redundant and irrelevant features. However, there has been limited investigation into the efficiency of execution both before and after event log simplification. In this paper, we present a prediction point selection algorithm designed to avoid the simplification of all points that function similarly. We select sequences or self-loop structures to form a simplifiable segment, and we optimize the deviation between the actual simplifiable value and the original data prediction value to prevent over-simplification. Experiments indicate that the simplified event log retains its predictive performance and, in some cases, enhances its predictive accuracy compared to the original event log.

Via

Access Paper or Ask Questions

DUET: Dual Clustering Enhanced Multivariate Time Series Forecasting

Dec 14, 2024

Xiangfei Qiu, Xingjian Wu, Yan Lin, Chenjuan Guo, Jilin Hu, Bin Yang

Figure 1 for DUET: Dual Clustering Enhanced Multivariate Time Series Forecasting

Figure 2 for DUET: Dual Clustering Enhanced Multivariate Time Series Forecasting

Figure 3 for DUET: Dual Clustering Enhanced Multivariate Time Series Forecasting

Figure 4 for DUET: Dual Clustering Enhanced Multivariate Time Series Forecasting

Abstract:Multivariate time series forecasting is crucial for various applications, such as financial investment, energy management, weather forecasting, and traffic optimization. However, accurate forecasting is challenging due to two main factors. First, real-world time series often show heterogeneous temporal patterns caused by distribution shifts over time. Second, correlations among channels are complex and intertwined, making it hard to model the interactions among channels precisely and flexibly. In this study, we address these challenges by proposing a general framework called \textbf{DUET}, which introduces \underline{DU}al clustering on the temporal and channel dimensions to \underline{E}nhance multivariate \underline{T}ime series forecasting. First, we design a Temporal Clustering Module (TCM) that clusters time series into fine-grained distributions to handle heterogeneous temporal patterns. For different distribution clusters, we design various pattern extractors to capture their intrinsic temporal patterns, thus modeling the heterogeneity. Second, we introduce a novel Channel-Soft-Clustering strategy and design a Channel Clustering Module (CCM), which captures the relationships among channels in the frequency domain through metric learning and applies sparsification to mitigate the adverse effects of noisy channels. Finally, DUET combines TCM and CCM to incorporate both the temporal and channel dimensions. Extensive experiments on 25 real-world datasets from 10 application domains, demonstrate the state-of-the-art performance of DUET.

* Accepted by KDD 2025

Via

Access Paper or Ask Questions

An Experimental Evaluation of Imputation Models for Spatial-Temporal Traffic Data

Dec 06, 2024

Shengnan Guo, Tonglong Wei, Yiheng Huang, Miaomiao Zhao, Ran Chen, Yan Lin, Youfang Lin, Huaiyu Wan

Figure 1 for An Experimental Evaluation of Imputation Models for Spatial-Temporal Traffic Data

Figure 2 for An Experimental Evaluation of Imputation Models for Spatial-Temporal Traffic Data

Figure 3 for An Experimental Evaluation of Imputation Models for Spatial-Temporal Traffic Data

Figure 4 for An Experimental Evaluation of Imputation Models for Spatial-Temporal Traffic Data

Abstract:Traffic data imputation is a critical preprocessing step in intelligent transportation systems, enabling advanced transportation services. Despite significant advancements in this field, selecting the most suitable model for practical applications remains challenging due to three key issues: 1) incomprehensive consideration of missing patterns that describe how data loss along spatial and temporal dimensions, 2) the lack of test on standardized datasets, and 3) insufficient evaluations. To this end, we first propose practice-oriented taxonomies for missing patterns and imputation models, systematically identifying all possible forms of real-world traffic data loss and analyzing the characteristics of existing models. Furthermore, we introduce a unified benchmarking pipeline to comprehensively evaluate 10 representative models across various missing patterns and rates. This work aims to provide a holistic understanding of traffic data imputation research and serve as a practical guideline.

Via

Access Paper or Ask Questions

Mobility-LLM: Learning Visiting Intentions and Travel Preferences from Human Mobility Data with Large Language Models

Oct 29, 2024

Letian Gong, Yan Lin, Xinyue Zhang, Yiwen Lu, Xuedi Han, Yichen Liu, Shengnan Guo, Youfang Lin, Huaiyu Wan

Figure 1 for Mobility-LLM: Learning Visiting Intentions and Travel Preferences from Human Mobility Data with Large Language Models

Figure 2 for Mobility-LLM: Learning Visiting Intentions and Travel Preferences from Human Mobility Data with Large Language Models

Figure 3 for Mobility-LLM: Learning Visiting Intentions and Travel Preferences from Human Mobility Data with Large Language Models

Figure 4 for Mobility-LLM: Learning Visiting Intentions and Travel Preferences from Human Mobility Data with Large Language Models

Abstract:Location-based services (LBS) have accumulated extensive human mobility data on diverse behaviors through check-in sequences. These sequences offer valuable insights into users' intentions and preferences. Yet, existing models analyzing check-in sequences fail to consider the semantics contained in these sequences, which closely reflect human visiting intentions and travel preferences, leading to an incomplete comprehension. Drawing inspiration from the exceptional semantic understanding and contextual information processing capabilities of large language models (LLMs) across various domains, we present Mobility-LLM, a novel framework that leverages LLMs to analyze check-in sequences for multiple tasks. Since LLMs cannot directly interpret check-ins, we reprogram these sequences to help LLMs comprehensively understand the semantics of human visiting intentions and travel preferences. Specifically, we introduce a visiting intention memory network (VIMN) to capture the visiting intentions at each record, along with a shared pool of human travel preference prompts (HTPP) to guide the LLM in understanding users' travel preferences. These components enhance the model's ability to extract and leverage semantic information from human mobility data effectively. Extensive experiments on four benchmark datasets and three downstream tasks demonstrate that our approach significantly outperforms existing models, underscoring the effectiveness of Mobility-LLM in advancing our understanding of human mobility data within LBS contexts.

* Accepted by NeurIPS2024

Via

Access Paper or Ask Questions

PTR: A Pre-trained Language Model for Trajectory Recovery

Oct 18, 2024

Tonglong Wei, Yan Lin, Youfang Lin, Shengnan Guo, Jilin Hu, Gao Cong, Huaiyu Wan

Abstract:Spatiotemporal trajectory data is vital for web-of-things services and is extensively collected and analyzed by web-based hardware and platforms. However, issues such as service interruptions and network instability often lead to sparsely recorded trajectories, resulting in a loss of detailed movement data. As a result, recovering these trajectories to restore missing information becomes essential. Despite progress, several challenges remain unresolved. First, the lack of large-scale dense trajectory data hampers the performance of existing deep learning methods, which rely heavily on abundant data for supervised training. Second, current methods struggle to generalize across sparse trajectories with varying sampling intervals, necessitating separate re-training for each interval and increasing computational costs. Third, external factors crucial for the recovery of missing points are not fully incorporated. To address these challenges, we propose a framework called PTR. This framework mitigates the issue of limited dense trajectory data by leveraging the capabilities of pre-trained language models (PLMs). PTR incorporates an explicit trajectory prompt and is trained on datasets with multiple sampling intervals, enabling it to generalize effectively across different intervals in sparse trajectories. To capture external factors, we introduce an implicit trajectory prompt that models road conditions, providing richer information for recovering missing points. Additionally, we present a trajectory embedder that encodes trajectory points and transforms the embeddings of both observed and missing points into a format comprehensible to PLMs. Experimental results on two public trajectory datasets with three sampling intervals demonstrate the efficacy and scalability of PTR.

Via

Access Paper or Ask Questions

DutyTTE: Deciphering Uncertainty in Origin-Destination Travel Time Estimation

Aug 23, 2024

Xiaowei Mao, Yan Lin, Shengnan Guo, Yubin Chen, Xingyu Xian, Haomin Wen, Qisen Xu, Youfang Lin, Huaiyu Wan

Abstract:Uncertainty quantification in travel time estimation (TTE) aims to estimate the confidence interval for travel time, given the origin (O), destination (D), and departure time (T). Accurately quantifying this uncertainty requires generating the most likely path and assessing travel time uncertainty along the path. This involves two main challenges: 1) Predicting a path that aligns with the ground truth, and 2) modeling the impact of travel time in each segment on overall uncertainty under varying conditions. We propose DutyTTE to address these challenges. For the first challenge, we introduce a deep reinforcement learning method to improve alignment between the predicted path and the ground truth, providing more accurate travel time information from road segments to improve TTE. For the second challenge, we propose a mixture of experts guided uncertainty quantification mechanism to better capture travel time uncertainty for each segment under varying contexts. Additionally, we calibrate our results using Hoeffding's upper-confidence bound to provide statistical guarantees for the estimated confidence intervals. Extensive experiments on two real-world datasets demonstrate the superiority of our proposed method.

* 7 pages

Via

Access Paper or Ask Questions

PTrajM: Efficient and Semantic-rich Trajectory Learning with Pretrained Trajectory-Mamba

Aug 09, 2024

Yan Lin, Yichen Liu, Zeyu Zhou, Haomin Wen, Erwen Zheng, Shengnan Guo, Youfang Lin, Huaiyu Wan

Figure 1 for PTrajM: Efficient and Semantic-rich Trajectory Learning with Pretrained Trajectory-Mamba

Figure 2 for PTrajM: Efficient and Semantic-rich Trajectory Learning with Pretrained Trajectory-Mamba

Figure 3 for PTrajM: Efficient and Semantic-rich Trajectory Learning with Pretrained Trajectory-Mamba

Figure 4 for PTrajM: Efficient and Semantic-rich Trajectory Learning with Pretrained Trajectory-Mamba

Abstract:Vehicle trajectories provide crucial movement information for various real-world applications. To better utilize vehicle trajectories, it is essential to develop a trajectory learning approach that can effectively and efficiently extract rich semantic information, including movement behavior and travel purposes, to support accurate downstream applications. However, creating such an approach presents two significant challenges. First, movement behavior are inherently spatio-temporally continuous, making them difficult to extract efficiently from irregular and discrete trajectory points. Second, travel purposes are related to the functionalities of areas and road segments traversed by vehicles. These functionalities are not available from the raw spatio-temporal trajectory features and are hard to extract directly from complex textual features associated with these areas and road segments. To address these challenges, we propose PTrajM, a novel method capable of efficient and semantic-rich vehicle trajectory learning. To support efficient modeling of movement behavior, we introduce Trajectory-Mamba as the learnable model of PTrajM, which effectively extracts continuous movement behavior while being more computationally efficient than existing structures. To facilitate efficient extraction of travel purposes, we propose a travel purpose-aware pre-training procedure, which enables PTrajM to discern the travel purposes of trajectories without additional computational resources during its embedding process. Extensive experiments on two real-world datasets and comparisons with several state-of-the-art trajectory learning methods demonstrate the effectiveness of PTrajM. Code is available at https://anonymous.4open.science/r/PTrajM-C973.

Via

Access Paper or Ask Questions

Spatial-Temporal Cross-View Contrastive Pre-training for Check-in Sequence Representation Learning

Jul 25, 2024

Letian Gong, Huaiyu Wan, Shengnan Guo, Xiucheng Li, Yan Lin, Erwen Zheng, Tianyi Wang, Zeyu Zhou, Youfang Lin

Figure 1 for Spatial-Temporal Cross-View Contrastive Pre-training for Check-in Sequence Representation Learning

Figure 2 for Spatial-Temporal Cross-View Contrastive Pre-training for Check-in Sequence Representation Learning

Figure 3 for Spatial-Temporal Cross-View Contrastive Pre-training for Check-in Sequence Representation Learning

Figure 4 for Spatial-Temporal Cross-View Contrastive Pre-training for Check-in Sequence Representation Learning

Abstract:The rapid growth of location-based services (LBS) has yielded massive amounts of data on human mobility. Effectively extracting meaningful representations for user-generated check-in sequences is pivotal for facilitating various downstream services. However, the user-generated check-in data are simultaneously influenced by the surrounding objective circumstances and the user's subjective intention. Specifically, the temporal uncertainty and spatial diversity exhibited in check-in data make it difficult to capture the macroscopic spatial-temporal patterns of users and to understand the semantics of user mobility activities. Furthermore, the distinct characteristics of the temporal and spatial information in check-in sequences call for an effective fusion method to incorporate these two types of information. In this paper, we propose a novel Spatial-Temporal Cross-view Contrastive Representation (STCCR) framework for check-in sequence representation learning. Specifically, STCCR addresses the above challenges by employing self-supervision from "spatial topic" and "temporal intention" views, facilitating effective fusion of spatial and temporal information at the semantic level. Besides, STCCR leverages contrastive clustering to uncover users' shared spatial topics from diverse mobility activities, while employing angular momentum contrast to mitigate the impact of temporal uncertainty and noise. We extensively evaluate STCCR on three real-world datasets and demonstrate its superior performance across three downstream tasks.

* This paper has been accepted as a regular paper at IEEE TKDE

Via

Access Paper or Ask Questions

UniTE: A Survey and Unified Pipeline for Pre-training ST Trajectory Embeddings

Jul 17, 2024

Yan Lin, Zeyu Zhou, Yicheng Liu, Haochen Lv, Haomin Wen, Tianyi Li, Yushuai Li, Christian S. Jensen, Shengnan Guo, Youfang Lin(+1 more)

Figure 1 for UniTE: A Survey and Unified Pipeline for Pre-training ST Trajectory Embeddings

Figure 2 for UniTE: A Survey and Unified Pipeline for Pre-training ST Trajectory Embeddings

Figure 3 for UniTE: A Survey and Unified Pipeline for Pre-training ST Trajectory Embeddings

Figure 4 for UniTE: A Survey and Unified Pipeline for Pre-training ST Trajectory Embeddings

Abstract:Spatio-temporal (ST) trajectories are sequences of timestamped locations, which enable a variety of analyses that in turn enable important real-world applications. It is common to map trajectories to vectors, called embeddings, before subsequent analyses. Thus, the qualities of embeddings are very important. Methods for pre-training embeddings, which leverage unlabeled trajectories for training universal embeddings, have shown promising applicability across different tasks, thus attracting considerable interest. However, research progress on this topic faces two key challenges: a lack of a comprehensive overview of existing methods, resulting in several related methods not being well-recognized, and the absence of a unified pipeline, complicating the development new methods and the analysis of methods. To overcome these obstacles and advance the field of pre-training of trajectory embeddings, we present UniTE, a survey and a unified pipeline for this domain. In doing so, we present a comprehensive list of existing methods for pre-training trajectory embeddings, which includes methods that either explicitly or implicitly employ pre-training techniques. Further, we present a unified and modular pipeline with publicly available underlying code, simplifying the process of constructing and evaluating methods for pre-training trajectory embeddings. Additionally, we contribute a selection of experimental results using the proposed pipeline on real-world datasets.

Via

Access Paper or Ask Questions

PLM4Traj: Cognizing Movement Patterns and Travel Purposes from Trajectories with Pre-trained Language Models

May 21, 2024

Zeyu Zhou, Yan Lin, Haomin Wen, Shengnan Guo, Jilin Hu, Youfang Lin, Huaiyu Wan

Figure 1 for PLM4Traj: Cognizing Movement Patterns and Travel Purposes from Trajectories with Pre-trained Language Models

Figure 2 for PLM4Traj: Cognizing Movement Patterns and Travel Purposes from Trajectories with Pre-trained Language Models

Figure 3 for PLM4Traj: Cognizing Movement Patterns and Travel Purposes from Trajectories with Pre-trained Language Models

Figure 4 for PLM4Traj: Cognizing Movement Patterns and Travel Purposes from Trajectories with Pre-trained Language Models

Abstract:Spatio-temporal trajectories play a vital role in various spatio-temporal data mining tasks. Developing a versatile trajectory learning approach that can adapt to different tasks while ensuring high accuracy is crucial. This requires effectively extracting movement patterns and travel purposes embedded in trajectories. However, this task is challenging due to limitations in the size and quality of available trajectory datasets. On the other hand, pre-trained language models (PLMs) have shown great success in adapting to different tasks by training on large-scale, high-quality corpus datasets. Given the similarities between trajectories and sentences, there is potential in leveraging PLMs to enhance the development of a versatile and effective trajectory learning method. Nevertheless, vanilla PLMs are not tailored to handle the unique spatio-temporal features present in trajectories and lack the capability to extract movement patterns and travel purposes from them. To overcome these obstacles, we propose a model called PLM4Traj that effectively utilizes PLMs to model trajectories. PLM4Traj leverages the strengths of PLMs to create a versatile trajectory learning approach while addressing the limitations of vanilla PLMs in modeling trajectories. Firstly, PLM4Traj incorporates a novel trajectory semantic embedder that enables PLMs to process spatio-temporal features in trajectories and extract movement patterns and travel purposes from them. Secondly, PLM4Traj introduces a novel trajectory prompt that integrates movement patterns and travel purposes into PLMs, while also allowing the model to adapt to various tasks. Extensive experiments conducted on two real-world datasets and two representative tasks demonstrate that PLM4Traj successfully achieves its design goals. Codes are available at https://github.com/Zeru19/PLM4Traj.

Via

Access Paper or Ask Questions