Abstract: Time series forecasting is prevalent in extensive real-world applications, such as financial analysis and energy planning. Previous studies primarily focus on the time series modality, endeavoring to capture the intricate variations and dependencies inherent in time series. Beyond numerical time series data, we notice that metadata (e.g., dataset and variate descriptions) also carries valuable information essential for forecasting, which can be used to identify the application scenario and provides more interpretable knowledge than raw digit sequences. Inspired by this observation, we propose the Metadata-informed Time Series Transformer (MetaTST), which incorporates multiple levels of context-specific metadata into Transformer forecasting models to enable informative time series forecasting. To tackle the unstructured nature of metadata, MetaTST formalizes it into natural language with pre-designed templates and leverages large language models (LLMs) to encode these texts into metadata tokens as a supplement to classic series tokens, resulting in an informative embedding. Further, a Transformer encoder is employed to communicate series and metadata tokens, which can extend series representations with metadata information for more accurate forecasting. This design also allows the model to adaptively learn context-specific patterns across various scenarios, which is particularly effective in handling large-scale, diverse-scenario forecasting tasks. Experimentally, MetaTST achieves state-of-the-art performance compared with advanced time series models and LLM-based methods on widely acknowledged short- and long-term forecasting benchmarks, covering both single-dataset individual and multi-dataset joint training settings.
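A minimal sketch of the token-fusion idea described above, not the authors' code: all shapes, names (MetaTSTSketch), and the stand-in metadata embeddings are assumptions; in the paper the metadata tokens would come from a frozen LLM encoding templated text.

```python
import torch
import torch.nn as nn

class MetaTSTSketch(nn.Module):
    def __init__(self, patch_len=16, n_patches=24, d_model=256,
                 d_llm=768, pred_len=96):
        super().__init__()
        self.patch_embed = nn.Linear(patch_len, d_model)   # series tokens
        self.meta_proj = nn.Linear(d_llm, d_model)         # align LLM space
        layer = nn.TransformerEncoderLayer(d_model, nhead=8, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(n_patches * d_model, pred_len)
        self.n_patches, self.patch_len = n_patches, patch_len

    def forward(self, series, meta_embeds):
        # series: [B, n_patches * patch_len]
        # meta_embeds: [B, n_meta, d_llm], hypothetically produced by a frozen
        # LLM from templated text like "Dataset: electricity; variate: ..."
        tokens = self.patch_embed(
            series.view(-1, self.n_patches, self.patch_len))
        meta = self.meta_proj(meta_embeds)
        fused = self.encoder(torch.cat([tokens, meta], dim=1))
        # forecast from the series positions only, now enriched by metadata
        return self.head(fused[:, :self.n_patches].flatten(1))

x = torch.randn(4, 24 * 16)          # toy input series
m = torch.randn(4, 3, 768)           # stand-in for LLM metadata embeddings
print(MetaTSTSketch()(x, m).shape)   # torch.Size([4, 96])
```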
Abstract: Time series, characterized by a sequence of data points arranged in a discrete-time order, are ubiquitous in real-world applications. Different from other modalities, time series present unique challenges due to their complex and dynamic nature, including the entanglement of nonlinear patterns and time-variant trends. Analyzing time series data is of great significance in real-world scenarios and has been widely studied for centuries. Recent years have witnessed remarkable breakthroughs in the time series community, with techniques shifting from traditional statistical methods to advanced deep learning models. In this paper, we delve into the design of deep time series models across various analysis tasks and review the existing literature from two perspectives: basic modules and model architectures. Further, we develop and release the Time Series Library (TSLib) as a fair benchmark of deep time series models for diverse analysis tasks; it implements 24 mainstream models, covers 30 datasets from different domains, and supports five prevalent analysis tasks. Based on TSLib, we thoroughly evaluate 12 advanced deep time series models on different tasks. Empirical results indicate that models with specific structures are well suited to distinct analytical tasks, offering insights for the research and adoption of deep time series models. Code is available at https://github.com/thuml/Time-Series-Library.
Abstract: Deep models have recently emerged as a promising tool for solving partial differential equations (PDEs), known as neural PDE solvers. While neural solvers trained on either simulation data or a physics-informed loss can solve PDEs reasonably well, they are mainly restricted to a specific set of PDEs, e.g., a certain equation or a finite set of coefficients. This bottleneck limits the generalizability of neural solvers, which is widely recognized as their major advantage over numerical solvers. In this paper, we present the Universal PDE solver (Unisolver), capable of solving a wide scope of PDEs by leveraging a Transformer pre-trained on diverse data and conditioned on diverse PDEs. Instead of simply scaling up data and parameters, Unisolver stems from a theoretical analysis of the PDE-solving process. Our key finding is that a PDE solution is fundamentally controlled by a series of PDE components, e.g., equation symbols, coefficients, and initial and boundary conditions. Inspired by the mathematical structure of PDEs, we define a complete set of PDE components and correspondingly embed them as domain-wise (e.g., equation symbols) and point-wise (e.g., boundaries) conditions for Transformer PDE solvers. Integrating physical insights with recent Transformer advances, Unisolver achieves consistent state-of-the-art results on three challenging large-scale benchmarks, showing impressive gains with favorable generalizability and scalability.
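A hypothetical sketch of the domain-wise vs. point-wise conditioning split (the exact injection scheme, a FiLM-style modulation here, and all names are assumptions): a coefficient vector modulates every token, while per-point boundary information is embedded additively.

```python
import torch
import torch.nn as nn

class ConditionedBlock(nn.Module):
    def __init__(self, d_model=128, n_heads=4, d_domain=8, d_point=2):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)
        # domain-wise condition -> per-channel scale and shift (FiLM-style)
        self.domain_mlp = nn.Linear(d_domain, 2 * d_model)
        # point-wise condition -> additive per-token embedding
        self.point_mlp = nn.Linear(d_point, d_model)

    def forward(self, x, domain_cond, point_cond):
        # x: [B, N, d_model] point tokens on the discretized domain
        # domain_cond: [B, d_domain] (e.g., equation coefficients)
        # point_cond: [B, N, d_point] (e.g., boundary indicator + value)
        scale, shift = self.domain_mlp(domain_cond).chunk(2, dim=-1)
        h = x * (1 + scale.unsqueeze(1)) + shift.unsqueeze(1)
        h = h + self.point_mlp(point_cond)
        out, _ = self.attn(h, h, h)
        return self.norm(x + out)

blk = ConditionedBlock()
x = torch.randn(2, 100, 128)
print(blk(x, torch.randn(2, 8), torch.randn(2, 100, 2)).shape)  # [2, 100, 128]
```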
Abstract: Time series forecasting is widely used in extensive applications, such as traffic planning and weather forecasting. However, real-world time series usually present intricate temporal variations, making forecasting extremely challenging. Going beyond the mainstream paradigms of plain decomposition and multiperiodicity analysis, we analyze temporal variations from a novel view of multiscale mixing, based on an intuitive but important observation that time series present distinct patterns at different sampling scales. The microscopic and the macroscopic information are reflected in fine and coarse scales, respectively, so complex variations can be inherently disentangled. Based on this observation, we propose TimeMixer, a fully MLP-based architecture with Past-Decomposable-Mixing (PDM) and Future-Multipredictor-Mixing (FMM) blocks to take full advantage of disentangled multiscale series in both past-extraction and future-prediction phases. Concretely, PDM applies decomposition to multiscale series and further mixes the decomposed seasonal and trend components in fine-to-coarse and coarse-to-fine directions separately, successively aggregating microscopic seasonal and macroscopic trend information. FMM further ensembles multiple predictors to utilize complementary forecasting capabilities across multiscale observations. Consequently, TimeMixer achieves consistent state-of-the-art performance on both long-term and short-term forecasting tasks with favorable run-time efficiency.
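A simplified sketch of the two stages under assumed shapes (not the released code): multiscale views are built by average pooling, each view is split into trend/seasonal parts by a moving average, and one predictor per scale is ensembled in the spirit of FMM; the cross-scale fine-to-coarse/coarse-to-fine mixing of PDM is omitted for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def decompose(x, kernel=5):
    # x: [B, L] -> seasonal residual + moving-average trend
    trend = F.avg_pool1d(x.unsqueeze(1), kernel, stride=1,
                         padding=kernel // 2,
                         count_include_pad=False).squeeze(1)
    return x - trend, trend

class TimeMixerSketch(nn.Module):
    def __init__(self, seq_len=96, pred_len=24, n_scales=3):
        super().__init__()
        lens = [seq_len // (2 ** i) for i in range(n_scales)]
        self.season_heads = nn.ModuleList(nn.Linear(l, pred_len) for l in lens)
        self.trend_heads = nn.ModuleList(nn.Linear(l, pred_len) for l in lens)
        self.n_scales = n_scales

    def forward(self, x):                        # x: [B, seq_len]
        scales = [x]
        for _ in range(self.n_scales - 1):       # coarser views by pooling
            scales.append(F.avg_pool1d(scales[-1].unsqueeze(1), 2).squeeze(1))
        outs = []
        for s, sh, th in zip(scales, self.season_heads, self.trend_heads):
            season, trend = decompose(s)         # past-decomposable view
            outs.append(sh(season) + th(trend))  # one predictor per scale
        return torch.stack(outs).mean(0)         # future multipredictor mixing

print(TimeMixerSketch()(torch.randn(8, 96)).shape)  # torch.Size([8, 24])
```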
Abstract: Physics-informed neural networks (PINNs) have been widely applied to solve partial differential equations (PDEs) by enforcing the outputs and gradients of deep models to satisfy target equations. Due to the limitations of numerical computation, PINNs are conventionally optimized on a finite set of selected points. However, since PDEs are usually defined on continuous domains, solely optimizing models on scattered points may be insufficient to obtain an accurate solution over the whole domain. To mitigate this inherent deficiency of the default scatter-point optimization, this paper proposes and theoretically studies a new training paradigm, region optimization. Concretely, we propose to extend the optimization process of PINNs from isolated points to their continuous neighborhood regions, which can theoretically decrease the generalization error, especially for hidden high-order constraints of PDEs. A practical training algorithm, Region Optimized PINN (RoPINN), is seamlessly derived from this new paradigm and implemented by a straightforward but effective Monte Carlo sampling method. By calibrating the sampling process into trust regions, RoPINN carefully balances sampling efficiency and generalization error. Experimentally, RoPINN consistently boosts the performance of diverse PINNs on a wide range of PDEs without extra backpropagation or gradient calculation.
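A minimal sketch of region optimization in this spirit, on a toy 1D ODE u' = u with u(0) = 1 (the region radius r and network are assumptions, and the trust-region calibration of RoPINN is omitted): each step, the residual is evaluated at points perturbed uniformly within a small neighborhood of the fixed collocation points rather than at the points themselves.

```python
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
x0 = torch.linspace(0, 1, 32).unsqueeze(1)   # fixed collocation points
r = 0.02                                     # assumed neighborhood radius

for step in range(2000):
    # Monte Carlo sample one point per neighborhood region
    x = (x0 + r * (2 * torch.rand_like(x0) - 1)).requires_grad_(True)
    u = net(x)
    du = torch.autograd.grad(u.sum(), x, create_graph=True)[0]
    residual = ((du - u) ** 2).mean()        # enforce u' = u over regions
    ic = (net(torch.zeros(1, 1)) - 1.0).pow(2).mean()  # u(0) = 1
    loss = residual + ic
    opt.zero_grad()
    loss.backward()
    opt.step()

print(float(net(torch.ones(1, 1))))          # should approach e ~ 2.718
```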
Abstract: Deep models have recently demonstrated remarkable performance in time series forecasting. However, due to the partially-observed nature of real-world applications, solely focusing on the target of interest, the so-called endogenous variables, is usually insufficient to guarantee accurate forecasting. Notably, a system is often recorded as multiple variables, where the exogenous series can provide valuable external information for the endogenous variables. Thus, unlike prior well-established multivariate or univariate forecasting paradigms that either treat all variables equally or overlook exogenous information, this paper focuses on a practical setting: time series forecasting with exogenous variables. We propose a novel framework, TimeXer, that utilizes external information to enhance the forecasting of endogenous variables. With a deftly designed embedding layer, TimeXer empowers the canonical Transformer architecture with the ability to reconcile endogenous and exogenous information, where patch-wise self-attention and variate-wise cross-attention are employed. Moreover, a global endogenous variate token is adopted to effectively bridge the exogenous series into the endogenous temporal patches. Experimentally, TimeXer significantly improves time series forecasting with exogenous variables and achieves consistent state-of-the-art performance on twelve real-world forecasting benchmarks.
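A schematic sketch of the token layout (hypothetical sizes, not the official implementation): endogenous patches plus a learnable global token interact through self-attention; the global token then queries variate-level embeddings of the exogenous series via cross-attention, bridging external information back into the patch sequence.

```python
import torch
import torch.nn as nn

class TimeXerSketch(nn.Module):
    def __init__(self, patch_len=16, n_patches=6, exo_len=96, d_model=128,
                 pred_len=24):
        super().__init__()
        self.patch_embed = nn.Linear(patch_len, d_model)
        self.exo_embed = nn.Linear(exo_len, d_model)   # one token per variate
        self.global_tok = nn.Parameter(torch.randn(1, 1, d_model))
        self.self_attn = nn.MultiheadAttention(d_model, 4, batch_first=True)
        self.cross_attn = nn.MultiheadAttention(d_model, 4, batch_first=True)
        self.head = nn.Linear((n_patches + 1) * d_model, pred_len)
        self.n_patches, self.patch_len = n_patches, patch_len

    def forward(self, endo, exo):
        # endo: [B, n_patches * patch_len]; exo: [B, n_exo_vars, exo_len]
        p = self.patch_embed(endo.view(-1, self.n_patches, self.patch_len))
        g = self.global_tok.expand(p.size(0), -1, -1)
        tokens = torch.cat([p, g], dim=1)              # patches + global token
        tokens, _ = self.self_attn(tokens, tokens, tokens)
        exo_tok = self.exo_embed(exo)                  # variate-wise tokens
        bridged, _ = self.cross_attn(tokens[:, -1:], exo_tok, exo_tok)
        tokens = torch.cat([tokens[:, :-1], bridged], dim=1)
        return self.head(tokens.flatten(1))

m = TimeXerSketch()
print(m(torch.randn(2, 96), torch.randn(2, 5, 96)).shape)  # [2, 24]
```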
Abstract: Transformers have empowered many milestones across various fields and have recently been applied to solve partial differential equations (PDEs). However, since PDEs are typically discretized into large-scale meshes with complex geometries, it is challenging for Transformers to capture intricate physical correlations directly from massive individual points. Going beyond superficial and unwieldy meshes, we present Transolver, based on a more foundational idea: learning the intrinsic physical states hidden behind discretized geometries. Specifically, we propose a new Physics-Attention that adaptively splits the discretized domain into a series of learnable slices of flexible shapes, where mesh points under similar physical states are ascribed to the same slice. By calculating attention over physics-aware tokens encoded from slices, Transolver can effectively capture intricate physical correlations under complex geometries, which also endows the solver with an inherent geometry-general modeling capacity and can be computed efficiently in linear complexity. Transolver achieves consistent state-of-the-art results with a 22% relative gain across six standard benchmarks and also excels in large-scale industrial simulations, including car and airfoil designs.
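A condensed sketch of the Physics-Attention idea (dimensions and names are assumptions): each mesh point is softly assigned to M learnable slices, slice tokens attend among themselves, and the result is scattered back to points with the same assignment weights, giving cost linear in the number of points.

```python
import torch
import torch.nn as nn

class PhysicsAttentionSketch(nn.Module):
    def __init__(self, d_model=64, n_slices=16, n_heads=4):
        super().__init__()
        self.assign = nn.Linear(d_model, n_slices)   # point -> slice weights
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x):                            # x: [B, N, d_model]
        w = torch.softmax(self.assign(x), dim=-1)    # [B, N, M], soft slices
        # weighted average of point features per slice -> slice tokens [B, M, D]
        slices = torch.einsum('bnm,bnd->bmd', w, x) / (
            w.sum(dim=1).unsqueeze(-1) + 1e-6)
        slices, _ = self.attn(slices, slices, slices)   # physics-aware tokens
        back = torch.einsum('bnm,bmd->bnd', w, slices)  # deslice to points
        return x + self.out(back)                       # residual update

layer = PhysicsAttentionSketch()
print(layer(torch.randn(2, 1000, 64)).shape)            # [2, 1000, 64]
```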
Abstract: Time series pre-training has recently garnered wide attention for its potential to reduce labeling expenses and benefit various downstream tasks. Prior methods are mainly based on pre-training techniques well-acknowledged in vision or language, such as masked modeling and contrastive learning. However, randomly masking time series or calculating series-wise similarity will distort or neglect the inherent temporal correlations crucial in time series data. To emphasize temporal correlation modeling, this paper proposes TimeSiam, a simple but effective self-supervised pre-training framework for Time series based on Siamese networks. Concretely, TimeSiam pre-trains Siamese encoders to capture intrinsic temporal correlations between randomly sampled past and current subseries. With a simple data augmentation method (e.g., masking), TimeSiam can benefit from diverse augmented subseries and learn internal time-dependent representations through past-to-current reconstruction. Moreover, learnable lineage embeddings are introduced to distinguish the temporal distance between sampled series and further foster the learning of diverse temporal correlations. TimeSiam consistently outperforms extensive advanced pre-training baselines, demonstrating superior forecasting and classification capabilities across 13 standard benchmarks in both intra- and cross-domain scenarios.
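A toy sketch of the pre-training objective (a simplification; the encoder, masking scheme, and lineage handling are assumptions): a shared Siamese encoder embeds a past and a masked current subseries, a lineage embedding encodes their temporal distance, and the model is trained to reconstruct the unmasked current subseries.

```python
import torch
import torch.nn as nn

class TimeSiamSketch(nn.Module):
    def __init__(self, seq_len=48, d_model=64, n_lineages=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(seq_len, d_model), nn.GELU(),
                                     nn.Linear(d_model, d_model))
        self.lineage = nn.Embedding(n_lineages, d_model)
        self.decoder = nn.Linear(2 * d_model, seq_len)

    def forward(self, past, current_masked, lineage_id):
        z_past = self.encoder(past)                 # Siamese branch 1
        z_cur = self.encoder(current_masked)        # Siamese branch 2 (shared)
        z_past = z_past + self.lineage(lineage_id)  # temporal-distance cue
        return self.decoder(torch.cat([z_past, z_cur], dim=-1))

model = TimeSiamSketch()
past, cur = torch.randn(8, 48), torch.randn(8, 48)
mask = (torch.rand(8, 48) > 0.25).float()           # simple random masking
recon = model(past, cur * mask, torch.randint(0, 4, (8,)))
loss = ((recon - cur) ** 2).mean()                  # past-to-current recon
print(loss.item())
```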
Abstract: Accurately predicting the future state of a fluid is important to extensive areas, such as meteorology, oceanology, and aerodynamics. However, since fluids are usually observed from an Eulerian perspective, their active and intricate dynamics are seriously obscured and confounded in static grids, bringing thorny challenges to prediction. This paper introduces a new Lagrangian-guided paradigm to tackle these tangled fluid dynamics. Instead of solely predicting the future based on Eulerian observations, we propose the Eulerian-Lagrangian Dual Recurrent Network (EuLagNet), which captures multiscale fluid dynamics by tracking the movements of adaptively sampled key particles across multiple scales and integrating dynamics information over time. Concretely, a EuLag Block is presented to communicate the learned Eulerian and Lagrangian features at each moment and scale, where the motion of tracked particles is inferred from Eulerian observations and their accumulated dynamics information is incorporated into Eulerian fields to guide future prediction. Tracking key particles not only provides a clear and interpretable clue to fluid dynamics but also frees our model from modeling complex correlations among massive grid points, yielding better efficiency. Experimentally, EuLagNet excels in three challenging fluid prediction tasks, covering both 2D and 3D, simulated and real-world fluids.
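A bare-bones sketch of the Eulerian-Lagrangian coupling, a heavy simplification with assumed shapes (the recurrent integration and multiscale structure are omitted): features of an Eulerian field are interpolated at tracked particle positions with grid_sample, a small MLP infers particle motion, and particles are advected one step.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class EuLagSketch(nn.Module):
    def __init__(self, channels=16):
        super().__init__()
        self.motion = nn.Sequential(nn.Linear(channels, 32), nn.GELU(),
                                    nn.Linear(32, 2))   # predicted (dx, dy)

    def forward(self, field, pos):
        # field: [B, C, H, W] Eulerian features; pos: [B, P, 2] in [-1, 1]
        feat = F.grid_sample(field, pos.unsqueeze(1), align_corners=True)
        feat = feat.squeeze(2).transpose(1, 2)          # [B, P, C]
        new_pos = pos + self.motion(feat)               # Lagrangian tracking
        return new_pos.clamp(-1, 1), feat               # positions + dynamics

block = EuLagSketch()
pos, _ = block(torch.randn(2, 16, 32, 32), torch.rand(2, 64, 2) * 2 - 1)
print(pos.shape)                                        # torch.Size([2, 64, 2])
```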
Abstract: Fluid simulation is a long-standing challenge due to intrinsic high-dimensional non-linear dynamics. Previous methods usually utilize the non-linear modeling capability of deep models to directly estimate velocity fields for future prediction. However, skipping over inherent physical properties and directly learning superficial velocity fields prevents the model from generating precise or physics-reliable results. In this paper, we propose HelmSim, an accurate and interpretable simulator for fluids. Inspired by the Helmholtz theorem, we design a HelmDynamic block to learn Helmholtz dynamics, which decomposes fluid dynamics into more solvable curl-free and divergence-free parts, physically corresponding to the potential and stream functions of the fluid. By embedding the HelmDynamic block into a Multiscale Integration Network, HelmSim can integrate the learned Helmholtz dynamics along the temporal dimension at multiple spatial scales to yield future fluid states. Compared with previous velocity-estimating methods, HelmSim is faithfully derived from the Helmholtz theorem and unravels complex fluid dynamics with physically interpretable evidence. Experimentally, HelmSim achieves consistent state-of-the-art results on both numerically simulated and real-world observed benchmarks, even for scenarios with complex boundaries.
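A compact sketch of the Helmholtz-decomposition idea (the network, shapes, and periodic-boundary finite differences are assumptions): a CNN predicts a scalar potential phi and stream function psi from observed frames, and the 2D velocity is recovered as grad(phi), the curl-free part, plus the rotated gradient of psi, the divergence-free part.

```python
import torch
import torch.nn as nn

def grad2d(f):    # f: [B, 1, H, W] -> central differences (periodic boundary)
    fx = (torch.roll(f, -1, dims=3) - torch.roll(f, 1, dims=3)) / 2
    fy = (torch.roll(f, -1, dims=2) - torch.roll(f, 1, dims=2)) / 2
    return fx, fy

class HelmDynamicSketch(nn.Module):
    def __init__(self, in_frames=2):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(in_frames, 32, 3, padding=1),
                                 nn.GELU(), nn.Conv2d(32, 2, 3, padding=1))

    def forward(self, frames):       # frames: [B, T, H, W]
        phi, psi = self.net(frames).chunk(2, dim=1)
        px, py = grad2d(phi)         # curl-free part: gradient of potential
        sx, sy = grad2d(psi)         # divergence-free part: (dpsi/dy, -dpsi/dx)
        u = px + sy
        v = py - sx
        return torch.stack([u.squeeze(1), v.squeeze(1)], dim=1)  # [B, 2, H, W]

model = HelmDynamicSketch()
print(model(torch.randn(2, 2, 64, 64)).shape)   # torch.Size([2, 2, 64, 64])
```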