Abstract:This study introduces a novel approach to Industrial Asset Management (IAM) by incorporating Conditional-Based Management (CBM) principles with the latest advancements in Large Language Models (LLMs). Our research introduces an automated model-building process, traditionally reliant on intensive collaboration between data scientists and domain experts. We present two primary innovations: a taxonomy-guided prompting generation that facilitates the automatic creation of AI solution recipes and a set of LLM pipelines designed to produce a solution recipe containing a set of artifacts composed of documents, sample data, and models for IAM. These pipelines, guided by standardized principles, enable the generation of initial solution templates for heterogeneous asset classes without direct human input, reducing reliance on extensive domain knowledge and enhancing automation. We evaluate our methodology by assessing asset health and sustainability across a spectrum of ten asset classes. Our findings illustrate the potential of LLMs and taxonomy-based LLM prompting pipelines in transforming asset management, offering a blueprint for subsequent research and development initiatives to be integrated into a rapid client solution.
Abstract:Query suggestion refers to the task of suggesting relevant and related queries to a search engine user to help in query formulation process and to expedite information retrieval with minimum amount of effort. It is highly useful in situations where the search requirements are not well understood and hence it has been widely adopted by search engines to guide users' search activity. For news websites, user queries have a time sensitive nature inherent in them. When some new event happens, there is a sudden burst in queries related to that event and such queries are sustained over a period of time before fading away with that event. In addition to this temporal aspect of search queries fired at news websites, they have an addition distinct quality, i.e., they are intended to get event related information majority of the times. Existing work on generating query suggestions involves analyzing query logs to suggest queries which are relevant and related to the search intent of the user. But in case of news websites, when there is a sudden burst in information related to a particular event, there are not enough search queries fired by other users which leads to lack of click data, and hence giving query suggestions related to some old event or even some irrelevant suggestions altogether. Another problem with query logs in the context of online news is that, they mostly contain queries related to popular events and hence fail to capture less popular events or events which got overshadowed by some other more sensational event. We propose a novel approach to generate event-centric query suggestions using metadata of news articles published by news media. We compared our proposed framework with existing state of the art query suggestion mechanisms provided by Google News, Bing News, Google Search and Bing Search on various parameters.
Abstract:Human behavior modeling deals with learning and understanding behavior patterns inherent in humans' daily routines. Existing pattern mining techniques either assume human dynamics is strictly periodic, or require the number of modes as input, or do not consider uncertainty in the sensor data. To handle these issues, in this paper, we propose a novel clustering approach for modeling human behavior (named, MTpattern) from time-series data. For mining frequent human behavior patterns effectively, we utilize a three-stage pipeline: (1) represent time series data into a sequence of regularly sampled equal-sized unit time intervals for better analysis, (2) a new distance measure scheme is proposed to cluster similar sequences which can handle temporal variation and uncertainty in the data, and (3) exploit an exemplar-based clustering mechanism and fine-tune its parameters to output minimum number of clusters with given permissible distance constraints and without knowing the number of modes present in the data. Then, the average of all sequences in a cluster is considered as a human behavior pattern. Empirical studies on two real-world datasets and a simulated dataset demonstrate the effectiveness of MTpattern with respect to internal and external measures of clustering.
Abstract:A large number of time series forecasting models including traditional statistical models, machine learning models and more recently deep learning have been proposed in the literature. However, choosing the right model along with good parameter values that performs well on a given data is still challenging. Automatically providing a good set of models to users for a given dataset saves both time and effort from using trial-and-error approaches with a wide variety of available models along with parameter optimization. We present AutoAI for Time Series Forecasting (AutoAI-TS) that provides users with a zero configuration (zero-conf ) system to efficiently train, optimize and choose best forecasting model among various classes of models for the given dataset. With its flexible zero-conf design, AutoAI-TS automatically performs all the data preparation, model creation, parameter optimization, training and model selection for users and provides a trained model that is ready to use. For given data, AutoAI-TS utilizes a wide variety of models including classical statistical models, Machine Learning (ML) models, statistical-ML hybrid models and deep learning models along with various transformations to create forecasting pipelines. It then evaluates and ranks pipelines using the proposed T-Daub mechanism to choose the best pipeline. The paper describe in detail all the technical aspects of AutoAI-TS along with extensive benchmarking on a variety of real world data sets for various use-cases. Benchmark results show that AutoAI-TS, with no manual configuration from the user, automatically trains and selects pipelines that on average outperform existing state-of-the-art time series forecasting toolkits.
Abstract:In this work we propose for the first time a transformer-based framework for unsupervised representation learning of multivariate time series. Pre-trained models can be potentially used for downstream tasks such as regression and classification, forecasting and missing value imputation. By evaluating our models on several benchmark datasets for multivariate time series regression and classification, we show that not only does our modeling approach represent the most successful method employing unsupervised learning of multivariate time series presented to date, but also that it exceeds the current state-of-the-art performance of supervised methods; it does so even when the number of training samples is very limited, while offering computational efficiency. Finally, we demonstrate that unsupervised pre-training of our transformer models offers a substantial performance benefit over fully supervised learning, even without leveraging additional unlabeled data, i.e., by reusing the same data samples through the unsupervised objective.