Abstract:Bayesian Neural Networks (BNNs) offer robust uncertainty quantification in model predictions, but training them presents a significant computational challenge. This is mainly due to the problem of sampling multimodal posterior distributions using Markov Chain Monte Carlo (MCMC) sampling and variational inference algorithms. Moreover, the number of model parameters scales exponentially with additional hidden layers, neurons, and features in the dataset. Typically, a significant portion of these densely connected parameters are redundant, and pruning a neural network not only improves portability but also has the potential for better generalisation capabilities. In this study, we address some of these challenges by combining MCMC sampling with network pruning to obtain compact probabilistic models with redundant parameters removed. We sample the posterior distribution of model parameters (weights and biases) and prune weights with low importance, resulting in a compact model. We ensure that the compact BNN retains its ability to estimate uncertainty via the posterior distribution, while retaining training and generalisation accuracy, by adapting post-pruning resampling. We evaluate the effectiveness of our MCMC pruning strategy on selected benchmark datasets for regression and classification problems through empirical result analysis. We also consider two coral reef drill-core lithology classification datasets to test the robustness of the pruning model on complex real-world datasets. We further investigate whether refining the compact BNN can recover any loss of performance. Our results demonstrate the feasibility of training and pruning BNNs using MCMC whilst retaining generalisation performance with over 75% reduction in network size. This paves the way for developing compact BNN models that provide uncertainty estimates for real-world applications.
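The abstract above describes pruning low-importance weights using posterior samples but does not state the importance measure. The following is a minimal sketch under an explicit assumption: importance is taken to be the posterior signal-to-noise ratio |mean| / std of each weight. The function name `prune_by_posterior_importance`, the `prune_fraction` parameter, and the choice of score are all hypothetical illustrations rather than the authors' method, and the post-pruning resampling step is not covered.

```python
import statistics


def prune_by_posterior_importance(samples, prune_fraction=0.75, eps=1e-12):
    """Zero out the least important weights across all posterior draws.

    samples: list of posterior draws, each a list of weights of equal length.
    Importance (an assumption of this sketch) is |posterior mean| / posterior std.
    Returns (mask, pruned_samples), where mask[j] is 1 for kept weights.
    """
    n_weights = len(samples[0])
    importance = []
    for j in range(n_weights):
        draws = [s[j] for s in samples]
        mean = statistics.fmean(draws)
        std = statistics.pstdev(draws)
        importance.append(abs(mean) / (std + eps))

    # Rank weights from least to most important and prune the lowest fraction.
    n_prune = int(prune_fraction * n_weights)
    order = sorted(range(n_weights), key=lambda j: importance[j])
    pruned = set(order[:n_prune])

    mask = [0 if j in pruned else 1 for j in range(n_weights)]
    pruned_samples = [[w * m for w, m in zip(s, mask)] for s in samples]
    return mask, pruned_samples


# Three toy posterior draws over four weights; only the second weight is
# consistently far from zero, so it should survive 75% pruning.
draws = [
    [0.01, 2.0, -0.02, 0.5],
    [-0.01, 2.1, 0.03, -0.4],
    [0.0, 1.9, -0.01, 0.1],
]
mask, compact = prune_by_posterior_importance(draws, prune_fraction=0.75)
```

Because the mask is applied to every draw, the surviving weights keep their full set of posterior samples, so predictive uncertainty can still be estimated from the compact model.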
Abstract:During the COVID-19 pandemic, community tensions intensified, fuelling Hinduphobic sentiments and discrimination against individuals of Hindu descent within India and worldwide. Large language models (LLMs) have become prominent in natural language processing (NLP) tasks and social media analysis, enabling longitudinal studies of platforms like X (formerly Twitter) for specific issues during COVID-19. We present an abuse detection and sentiment analysis framework that offers a longitudinal analysis of Hinduphobia on X (Twitter) during and after the COVID-19 pandemic. This framework assesses the prevalence and intensity of Hinduphobic discourse, capturing elements such as derogatory jokes and racist remarks through sentiment analysis and abuse detection from pre-trained and fine-tuned LLMs. Additionally, we curate and publish a "Hinduphobic COVID-19 X (Twitter) Dataset" of 8,000 tweets annotated for Hinduphobic abuse detection, which is used to fine-tune a BERT model, resulting in the development of the Hinduphobic BERT (HP-BERT) model. We then further fine-tune HP-BERT using the SenWave dataset for multi-label sentiment analysis. Our study encompasses approximately 27.4 million tweets from six countries, including Australia, Brazil, India, Indonesia, Japan, and the United Kingdom. Our findings reveal a strong correlation between spikes in COVID-19 cases and surges in Hinduphobic rhetoric, highlighting how political narratives, misinformation, and targeted jokes contributed to communal polarisation. These insights provide valuable guidance for developing strategies to mitigate communal tensions in future crises, both locally and globally. We advocate implementing automated monitoring and removal of such content on social media to curb divisive discourse.
Abstract:Uncertainty quantification is crucial in time series prediction, and quantile regression offers a valuable mechanism for uncertainty quantification which is useful for extreme value forecasting. Although deep learning models have been prominent in multi-step ahead prediction, the development and evaluation of quantile deep learning models have been limited. We present a novel quantile regression deep learning framework for multi-step time series prediction. In this way, we elevate the capabilities of deep learning models by incorporating quantile regression, thus providing a more nuanced understanding of predictive values. We provide an implementation of prominent deep learning models for multi-step ahead time series prediction and evaluate their performance under high volatility and extreme conditions. We include multivariate and univariate modelling strategies and provide a comparison with conventional deep learning models from the literature. Our models are tested on two cryptocurrencies, Bitcoin and Ethereum, using daily close-price data and selected benchmark time series datasets. The results show that integrating a quantile loss function with deep learning provides additional predictions for selected quantiles without a loss in prediction accuracy when compared to the literature. Our quantile model handles volatility more effectively and provides additional information for decision-making and uncertainty quantification through the use of quantiles when compared to conventional deep learning models.
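The quantile loss function mentioned above is commonly the pinball loss, which penalises under- and over-prediction asymmetrically according to the quantile level. The sketch below assumes that standard definition; the abstract does not give the exact form used, and the function names `pinball_loss` and `multi_quantile_loss` are hypothetical.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a single quantile level q in (0, 1)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    total = 0.0
    for y, y_hat in zip(y_true, y_pred):
        error = y - y_hat
        # Under-prediction (error > 0) costs q per unit;
        # over-prediction (error < 0) costs (1 - q) per unit.
        total += max(q * error, (q - 1.0) * error)
    return total / len(y_true)


def multi_quantile_loss(y_true, preds_by_quantile):
    """Average the pinball loss over several quantile outputs.

    preds_by_quantile maps each quantile level to its list of predictions.
    """
    losses = [pinball_loss(y_true, preds, q) for q, preds in preds_by_quantile.items()]
    return sum(losses) / len(losses)


observed = [10.0, 12.0, 9.0]
flat_forecast = [11.0, 11.0, 11.0]
median_loss = pinball_loss(observed, flat_forecast, 0.5)
upper_loss = pinball_loss(observed, flat_forecast, 0.9)
```

At q = 0.5 the pinball loss reduces to half the mean absolute error, so a model trained on several quantile levels including the median still produces a conventional point forecast alongside the additional quantile predictions.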
Abstract:Machine translation using large language models (LLMs) is having a significant global impact, making communication easier. Mandarin Chinese is the official language used for communication by the government, education institutes, and media in China. In this study, we provide an automated assessment of machine translation models against human experts using sentiment and semantic analysis. In order to demonstrate our framework, we select the classic early twentieth-century novel 'The True Story of Ah Q' with selected Mandarin Chinese to English translations. We also use Google Translate to translate the given text into English and then conduct a chapter-wise sentiment analysis and semantic analysis to compare the extracted sentiments across the different translations. We utilise LLMs for semantic and sentiment analysis. Our results indicate that the precision of Google Translate differs both in terms of semantic and sentiment analysis when compared to human expert translations. We find that Google Translate is unable to translate some of the specific words or phrases in Chinese, such as Chinese traditional allusions. The mistranslations are due to its lack of contextual significance and historical knowledge of China. Thus, this framework provides new insights into machine translation for Mandarin Chinese. Future work can explore other languages or types of texts with this framework.
Abstract:The COVID-19 pandemic has exacerbated xenophobia, particularly Sinophobia, leading to widespread discrimination against individuals of Chinese descent. Large language models (LLMs) are pre-trained deep learning models used for natural language processing (NLP) tasks. The ability of LLMs to understand and generate human-like text makes them particularly useful for analysing social media data to detect and evaluate sentiments. We present a sentiment analysis framework utilising LLMs for longitudinal sentiment analysis of the Sinophobic sentiments expressed on X (Twitter) during the COVID-19 pandemic. The results show a significant correlation between the spikes in Sinophobic tweets, Sinophobic sentiments and surges in COVID-19 cases, revealing that the evolution of the pandemic influenced public sentiment and the prevalence of Sinophobic discourse. Furthermore, the sentiment analysis revealed a predominant presence of negative sentiments, such as annoyance and denial, which underscores the impact of political narratives and misinformation in shaping public opinion. The lack of empathetic sentiment, which was present in previous studies related to COVID-19, highlights the way political narratives in the media viewed the pandemic and how they blamed the Chinese community. Our study highlights the importance of transparent communication in mitigating xenophobic sentiments during global crises.
Abstract:During the COVID-19 pandemic, news media coverage encompassed a wide range of topics, including viral transmission, allocation of medical resources, and government response measures. There have been studies on sentiment analysis of social media platforms during COVID-19 to understand the public response given the rise of cases and government strategies implemented to control the spread of the virus. Sentiment analysis can provide a better understanding of changes in societal opinions and emotional trends during the pandemic. Apart from social media, newspapers have played a vital role in the dissemination of information, including information from the government, experts, and also the public about various topics. A study of sentiment analysis of newspaper sources during COVID-19 for selected countries can give an overview of how the media covered the pandemic. In this study, we select The Guardian newspaper and provide a sentiment analysis during various stages of COVID-19, including initial transmission, lockdowns and vaccination. We employ novel large language models (LLMs) and refine them with expert-labelled sentiment analysis data. We also provide an analysis of sentiments expressed pre-pandemic for comparison. The results indicate that during the early pandemic stages, public sentiment prioritised urgent crisis response, later shifting focus to addressing the impact on health and the economy. In comparison with related studies of social media sentiment analysis, we found a discrepancy: The Guardian shows a dominance of negative sentiments (sad, annoyed, anxious and denial), suggesting that social media offers a more diversified emotional reflection. We found a grim narrative in The Guardian with overall dominance of negative sentiments, before and during COVID-19, across news sections including Australia, UK, World News, and Opinion.
Abstract:Supervised learning methods for geological mapping via remote sensing face limitations due to the scarcity of accurately labelled training data. In contrast, unsupervised learning methods, such as dimensionality reduction and clustering, have the ability to uncover patterns and structures in remote sensing data without relying on predefined labels. Dimensionality reduction methods have the potential to play a crucial role in improving the accuracy of geological maps. Although conventional dimensionality reduction methods may struggle with nonlinear data, unsupervised deep learning models such as autoencoders have the ability to model nonlinear relationships in data. Stacked autoencoders feature multiple interconnected layers to capture hierarchical data representations that can be useful for remote sensing data. In this study, we present an unsupervised machine learning framework for processing remote sensing data by utilizing stacked autoencoders for dimensionality reduction and k-means clustering for mapping geological units. We use the Landsat-8, ASTER, and Sentinel-2 datasets of the Mutawintji region in Western New South Wales, Australia to evaluate the framework for geological mapping. We also provide a comparison of stacked autoencoders with principal component analysis and canonical autoencoders. Our results reveal that the framework produces accurate and interpretable geological maps, efficiently discriminating rock units. We find that the stacked autoencoders provide better accuracy when compared to these counterparts. We also find that the generated maps align with prior geological knowledge of the study area while providing novel insights into geological structures.
Abstract:The revolution of natural language processing via large language models has motivated its use in multidisciplinary areas that include social sciences and humanities and more specifically, comparative religion. Sentiment analysis provides a mechanism to study the emotions expressed in text. Recently, sentiment analysis has been used to study and compare translations of the Bhagavad Gita, which is a fundamental and sacred Hindu text. In this study, we use sentiment analysis for studying selected chapters of the Bible. These chapters are known as the Sermon on the Mount. We utilize a pre-trained language model for sentiment analysis by reviewing five translations of the Sermon on the Mount, which include the King James version, the New International Version, the New Revised Standard Version, the Lamsa Version, and the Basic English Version. We provide a chapter-by-chapter and verse-by-verse comparison using sentiment and semantic analysis and review the major sentiments expressed. Our results highlight the varying sentiments across the chapters and verses. We found that the vocabulary of the respective translations is significantly different. We detected different levels of humour, optimism, and empathy in the respective chapters that were used by Jesus to deliver his message.
Abstract:Cancer diagnosis is a well-studied problem in machine learning since early detection of cancer is often the determining factor in prognosis. Supervised deep learning achieves excellent results in cancer image classification, usually through transfer learning. However, these models require large amounts of labelled data and for several types of cancer, large labelled datasets do not exist. In this paper, we demonstrate that a model pre-trained using a self-supervised learning algorithm known as Barlow Twins can outperform the conventional supervised transfer learning pipeline. We juxtapose two base models: i) pretrained in a supervised fashion on ImageNet; ii) pretrained in a self-supervised fashion on ImageNet. Both are subsequently fine-tuned on a small labelled skin lesion dataset and evaluated on a large test set. We achieve a mean test accuracy of 70% for self-supervised transfer in comparison to 66% for supervised transfer. Interestingly, boosting performance further is possible by self-supervised pretraining a second time (on unlabelled skin lesion images) before subsequent fine-tuning. This hints at an alternative path to collecting more labelled data in settings where this is challenging - namely just collecting more unlabelled images. Our framework is applicable to cancer image classification models in the low-labelled data regime.
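For readers unfamiliar with Barlow Twins, the self-supervised objective named above pushes the cross-correlation matrix between the embeddings of two augmented views towards the identity matrix. The sketch below illustrates that objective in plain Python under stated assumptions: embeddings are passed as lists of rows, features are standardised with the population standard deviation, and the function names and the default off-diagonal weight `lam` are hypothetical choices for this illustration, not values taken from the abstract.

```python
import math


def standardise(batch, eps=1e-12):
    """Centre and scale each feature (column) across the batch dimension."""
    n, d = len(batch), len(batch[0])
    means = [sum(row[j] for row in batch) / n for j in range(d)]
    stds = [
        math.sqrt(sum((row[j] - means[j]) ** 2 for row in batch) / n)
        for j in range(d)
    ]
    return [
        [(row[j] - means[j]) / max(stds[j], eps) for j in range(d)]
        for row in batch
    ]


def barlow_twins_loss(z_a, z_b, lam=5e-3):
    """Barlow Twins objective for two batches of embeddings (rows = samples).

    Diagonal terms pull matching features towards correlation 1 (invariance);
    off-diagonal terms, weighted by lam, push distinct features towards
    correlation 0 (redundancy reduction).
    """
    n, d = len(z_a), len(z_a[0])
    a, b = standardise(z_a), standardise(z_b)
    loss = 0.0
    for i in range(d):
        for j in range(d):
            c_ij = sum(a[k][i] * b[k][j] for k in range(n)) / n
            loss += (1.0 - c_ij) ** 2 if i == j else lam * c_ij ** 2
    return loss


# Two identical views with uncorrelated features give a loss of zero.
view = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]
aligned = barlow_twins_loss(view, view)

# Swapping the feature columns in the second view breaks the alignment.
swapped = [[row[1], row[0]] for row in view]
misaligned = barlow_twins_loss(view, swapped)
```

Since the objective needs only pairs of augmented views and no labels, it can be applied to unlabelled images, which is what makes a second round of pretraining on unlabelled skin lesion images possible.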
Abstract:Pedestrian trajectory prediction plays an important role in autonomous driving systems and robotics. Recent work utilising prominent deep learning models for pedestrian motion prediction makes limited a priori assumptions about human movements, resulting in a lack of explainability and of explicit constraints enforced on predicted trajectories. This paper presents a dynamics-based deep learning framework where a novel asymptotically stable dynamical system is integrated into a deep learning model. Our novel asymptotically stable dynamical system is used to model human goal-targeted motion by enforcing that the human walking trajectory converges to a predicted goal position, and it provides a deep learning model with prior knowledge and explainability. Our deep learning model utilises recent innovations from transformer networks and is used to learn some features of human motion, such as collision avoidance, for our proposed dynamical system. The experimental results show that our framework outperforms recent prominent models in pedestrian trajectory prediction on five benchmark human motion datasets.