Richi Nayak

QUT

ALGAN: Time Series Anomaly Detection with Adjusted-LSTM GAN

Aug 13, 2023

Md Abul Bashar, Richi Nayak

Abstract: Anomaly detection in time series data, which identifies points that deviate from normal behaviour, is a common problem in various domains such as manufacturing, medical imaging, and cybersecurity. Generative Adversarial Networks (GANs) have recently been shown to be effective in detecting anomalies in time series data. The neural network architecture of GANs (i.e. Generator and Discriminator) can significantly improve anomaly detection accuracy. In this paper, we propose a new GAN model, named Adjusted-LSTM GAN (ALGAN), which adjusts the output of an LSTM network for improved anomaly detection in both univariate and multivariate time series data in an unsupervised setting. We evaluate the performance of ALGAN on 46 real-world univariate time series datasets and a large multivariate dataset that spans multiple domains. Our experiments demonstrate that ALGAN outperforms traditional, neural network-based, and other GAN-based methods for anomaly detection in time series data.

Informed Machine Learning, Centrality, CNN, Relevant Document Detection, Repatriation of Indigenous Human Remains

Mar 25, 2023

Md Abul Bashar, Richi Nayak, Gareth Knapman, Paul Turnbull, Cressida Fforde

Abstract: Among the pressing issues facing Australian and other First Nations peoples is the repatriation of the bodily remains of their ancestors, which are currently held in Western scientific institutions. The success of securing the return of these remains to their communities for reburial depends largely on locating information within scientific and other literature published between 1790 and 1970 documenting their theft, donation, sale, or exchange between institutions. This article reports on collaborative research by data scientists and social science researchers in the Research, Reconcile, Renew Network (RRR) to develop and apply text mining techniques to identify this vital information. We describe our work to date on developing a machine learning-based solution to automate the process of finding and semantically analysing relevant texts. Classification models, particularly deep learning-based models, are known to have low accuracy when trained with small amounts of labelled (i.e. relevant/non-relevant) documents. To improve the accuracy of our detection model, we explore the use of an Informed Neural Network (INN) model that describes documentary content using expert-informed contextual knowledge. Only a few labelled documents are used to provide specificity to the model, using conceptually related keywords identified by RRR experts in provenance research. The results confirm the value of using an INN model for identifying relevant documents related to the investigation of the global commercial trade in Indigenous human remains. Empirical analysis suggests that this INN model can be generalized for use by other researchers in the social sciences and humanities who want to extract relevant information from large textual corpora.

* Social Science Computer Review (2023)
* Accepted Version of the Journal Article

Unsupervised Visual Time-Series Representation Learning and Clustering

Nov 19, 2021

Gaurangi Anand, Richi Nayak

Abstract: Time-series data is generated ubiquitously and in enormous volumes from Internet-of-Things (IoT) infrastructure, connected and wearable devices, remote sensing, autonomous driving research, and audio-video communications. This paper investigates the potential of unsupervised representation learning for these time-series. We use a novel data transformation along with a novel unsupervised learning regime to transfer the learning from other domains to time-series, where the former have extensive models heavily trained on very large labelled datasets. We conduct extensive experiments to demonstrate the potential of the proposed approach through time-series clustering.

* 9 pages, 4 figures, International Conference on Neural Information Processing. Springer, Cham, (2020) submitted version

Nonnegative Matrix Factorization to understand Spatio-Temporal Traffic Pattern Variations during COVID-19: A Case Study

Nov 05, 2021

Anandkumar Balasubramaniam, Thirunavukarasu Balasubramaniam, Rathinaraja Jeyaraj, Anand Paul, Richi Nayak

Abstract: Owing to rapid developments in Intelligent Transportation Systems (ITS) and the increasing number of vehicles on the road, an abundance of road traffic data is generated and available. Understanding spatio-temporal traffic patterns from this data is crucial and has effectively supported traffic planning, road construction, etc. However, understanding traffic patterns during the COVID-19 pandemic is both challenging and important, as there is a huge difference in people's and vehicles' travel behaviour patterns. In this paper, a case study is conducted to understand the variations in spatio-temporal traffic patterns during COVID-19. We apply nonnegative matrix factorization (NMF) to elicit patterns. The NMF model outputs are analysed based on the spatio-temporal pattern behaviours observed in Great Britain during 2019 and 2020, that is, before and during the pandemic respectively. The analysed variations in spatio-temporal traffic patterns will be useful for traffic management in Intelligent Transportation Systems and for management at various stages of a pandemic or other unavoidable scenarios related to road traffic.

* Accepted in the 19th Australasian Data Mining Conference 2021
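
The abstract above names NMF but not the specific algorithm or data layout used. As a rough illustration only, the sketch below applies the standard multiplicative update rule for the squared-error objective to a small, invented site-by-time-slot count matrix. The function names, the toy matrix `V`, and the choice of `rank=2` are all assumptions made for this example and are not taken from the paper.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def squared_error(V, W, H):
    """Sum of squared differences between V and the product W H."""
    WH = matmul(W, H)
    return sum((v - a) ** 2 for rv, ra in zip(V, WH) for v, a in zip(rv, ra))

def nmf(V, rank, iters=200, seed=0, eps=1e-9):
    """Factorise nonnegative V (n x m) into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    # Strictly positive random initialisation keeps every entry nonnegative.
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num = matmul(Wt, V)
        den = matmul(matmul(Wt, W), H)
        H = [[H[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num = matmul(V, Ht)
        den = matmul(W, matmul(H, Ht))
        W = [[W[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return W, H

# Hypothetical toy data: rows are road sites, columns are time slots.
V = [
    [90, 20, 15, 85],  # site with morning and evening peaks
    [10, 60, 55, 12],  # site with a midday peak
    [50, 40, 35, 48],  # site mixing both behaviours
]
W, H = nmf(V, rank=2)
```

In this layout each row of `H` can be read as a temporal pattern and each row of `W` as how strongly a site expresses each pattern; comparing factors fitted on two periods is one way such an analysis could be set up, though the paper's own procedure may differ.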

A Semi-automatic Data Extraction System for Heterogeneous Data Sources: A Case Study from Cotton Industry

Nov 05, 2021

Richi Nayak, Thirunavukarasu Balasubramaniam, Sangeetha Kutty, Sachindra Banduthilaka, Erin Peterson

Abstract: With recent developments in digitisation, an increasing number of documents are available online. Several information extraction tools are available to extract information from digitised documents. However, identifying precise answers to a given query is often a challenging task, especially if the data source where the relevant information resides is unknown. The situation becomes more complex when the data source is available in multiple formats such as PDF, table, and HTML. In this paper, we propose a novel data extraction system to discover relevant and focused information from diverse unstructured data sources based on text mining approaches. We perform a qualitative analysis to evaluate the proposed system and its suitability and adaptability using the cotton industry as a case study.

* Accepted in the 19th Australasian Data Mining Conference 2021

Investigation of Topic Modelling Methods for Understanding the Reports of the Mining Projects in Queensland

Nov 05, 2021

Yasuko Okamoto, Thirunavukarasu Balasubramaniam, Richi Nayak

Abstract: In the mining industry, many reports are generated in the project management process. These past documents are a great resource of knowledge for future success. However, retrieving the necessary information is a tedious and challenging task if the documents are unorganized and unstructured. Document clustering is a powerful approach to cope with the problem, and many methods have been introduced in past studies. Nonetheless, there is no silver bullet that performs best for every type of document. Thus, exploratory studies are required to apply the clustering methods to new datasets. In this study, we investigate multiple topic modelling (TM) methods. The objectives are to find the appropriate approach for the mining project reports using the dataset of the Geological Survey of Queensland, Department of Resources, Queensland Government, and to understand the contents in order to get an idea of how to organise them. Three TM methods, Latent Dirichlet Allocation (LDA), Nonnegative Matrix Factorization (NMF), and Nonnegative Tensor Factorization (NTF), are compared statistically and qualitatively. After the evaluation, we conclude that LDA performs the best for the dataset; however, the possibility remains that the other methods could be adopted with some improvements.

* Accepted in The 19th Australasian Data Mining Conference 2021

Deep Learning for Bias Detection: From Inception to Deployment

Oct 12, 2021

Md Abul Bashar, Richi Nayak, Anjor Kothare, Vishal Sharma, Kesavan Kandadai

Abstract: To create a more inclusive workplace, enterprises are actively investing in identifying and eliminating unconscious bias (e.g., gender, race, age, disability, elitism and religion) across their various functions. We propose a deep learning model with a transfer learning-based language model that learns from manually tagged documents to automatically identify bias in enterprise content. We first pretrain a deep learning-based language model using Wikipedia, then fine-tune the model with a large unlabelled dataset related to various types of enterprise content. Finally, a linear layer followed by a softmax layer is added at the end of the language model, and the model is trained on a labelled bias dataset consisting of enterprise content. The trained model is thoroughly evaluated on independent datasets to ensure general applicability. We present the proposed method and its deployment details in a real-world application.
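
The final step described above, a linear layer followed by a softmax layer on top of the language model, can be sketched in a few lines. This is only a minimal illustration: the three-dimensional `embedding`, the `weights` and `bias` values, and the two-class setup are invented for the example, and the language model that would produce the embedding is not shown.

```python
import math

def linear(x, weights, bias):
    """Affine map: one output logit per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def softmax(logits):
    """Numerically stable softmax: subtract the max before exponentiating."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(embedding, weights, bias):
    """Turn a document embedding into class probabilities."""
    return softmax(linear(embedding, weights, bias))

# Hypothetical values: a 3-dimensional document embedding and a head
# with two output classes (for instance "not biased" and "biased").
embedding = [0.2, -1.0, 0.5]
weights = [[1.0, 0.0, -1.0],
           [-1.0, 0.5, 1.0]]
bias = [0.0, 0.1]
probs = classify(embedding, weights, bias)
```

During training, the head's weights (and optionally the language model's) would be adjusted to minimise a loss such as cross-entropy against the labelled examples; that part is omitted here.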

TAnoGAN: Time Series Anomaly Detection with Generative Adversarial Networks

Sep 25, 2020

Md Abul Bashar, Richi Nayak

Abstract: Anomaly detection in time series data is a significant problem faced in many application areas such as manufacturing, medical imaging and cyber-security. Recently, Generative Adversarial Networks (GANs) have gained attention for generation and anomaly detection in the image domain. In this paper, we propose a novel GAN-based unsupervised method called TAnoGAN for detecting anomalies in time series when only a small number of data points are available. We evaluate TAnoGAN with 46 real-world time series datasets that cover a variety of domains. Extensive experimental results show that TAnoGAN performs better than traditional and neural network models.

* Made some minor changes. This is the accepted version of the paper at AusDM'20

Understanding the Spatio-temporal Topic Dynamics of Covid-19 using Nonnegative Tensor Factorization: A Case Study

Sep 19, 2020

Thirunavukarasu Balasubramaniam, Richi Nayak, Md Abul Bashar

Abstract: Social media platforms have fostered a data-driven world by enabling billions of people to share their thoughts and activities ubiquitously. This huge collection of data, if analysed properly, can provide useful insights into people's behavior. The Covid-19 pandemic makes it more crucial than ever to understand people's online behaviors, detailing what topics are being discussed, and where (space) and when (time) they are discussed. Given the high complexity and poor quality of this huge volume of social media data, an effective spatio-temporal topic detection method is needed. This paper proposes a tensor-based representation of social media data and Non-negative Tensor Factorization (NTF) to identify the topics discussed in social media data along with the spatio-temporal topic dynamics. A case study on Covid-19 related tweets from the Australian Twittersphere is presented to identify and visualize spatio-temporal topic dynamics on Covid-19.

* Accepted in 18th Australasian Data Mining Conference (AusDM)
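
The abstract above does not spell out how the tensor is built. One plausible reading, given its emphasis on what, where and when, is a three-way count tensor with term, location and time modes. The sketch below builds such a tensor from invented records; the function name, the mode ordering and the sample data are assumptions for illustration, and the factorization step itself is not shown.

```python
def build_count_tensor(records):
    """Build a term x location x time count tensor from (term, location, time) triples."""
    terms = sorted({r[0] for r in records})
    locations = sorted({r[1] for r in records})
    times = sorted({r[2] for r in records})
    # Map each label to its index along the corresponding mode.
    t_idx = {t: i for i, t in enumerate(terms)}
    l_idx = {l: i for i, l in enumerate(locations)}
    d_idx = {d: i for i, d in enumerate(times)}
    tensor = [[[0 for _ in times] for _ in locations] for _ in terms]
    for term, loc, time in records:
        tensor[t_idx[term]][l_idx[loc]][d_idx[time]] += 1
    return tensor, terms, locations, times

# Hypothetical records: one triple per term occurrence in a geotagged, dated post.
records = [
    ("lockdown", "NSW", "2020-03"),
    ("lockdown", "NSW", "2020-03"),
    ("lockdown", "VIC", "2020-04"),
    ("vaccine", "QLD", "2020-04"),
]
tensor, terms, locations, times = build_count_tensor(records)
```

A nonnegative factorization of a tensor like this would yield, per component, one factor over terms (a topic), one over locations and one over time, which is what makes the spatio-temporal reading possible.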

Learning Inter- and Intra-manifolds for Matrix Factorization-based Multi-Aspect Data Clustering

Sep 07, 2020

Khanh Luong, Richi Nayak

Abstract: Clustering data with multiple aspects, such as multi-view or multi-type relational data, has become popular in recent years due to its wide applicability. The approach of using manifold learning with the Non-negative Matrix Factorization (NMF) framework, which learns an accurate low-rank representation of the multi-dimensional data, has shown effectiveness. We propose to include the inter-manifold in the NMF framework, utilizing the distance information of data points of different data types (or views) to learn the diverse manifold for data clustering. Empirical analysis reveals that the proposed method can find partial representations of various interrelated types and select useful features during clustering. Results on several datasets demonstrate that the proposed method outperforms the state-of-the-art multi-aspect data clustering methods in both accuracy and efficiency.

* 15 pages with appendices
