Abstract:The generalisation of Neural Networks (NN) to multiple datasets is often overlooked in literature due to NNs typically being optimised for specific data sources. This becomes especially challenging in time-series-based multi-dataset models due to difficulties in fusing sequential data from different sensors and collection specifications. In a commercial environment, however, generalisation can effectively utilise available data and computational power, which is essential in the context of Green AI, the sustainable development of AI models. This paper introduces "Dataset Fusion," a novel dataset composition algorithm for fusing periodic signals from multiple homogeneous datasets into a single dataset while retaining unique features for generalised anomaly detection. The proposed approach, tested on a case study of 3-phase current data from 2 different homogeneous Induction Motor (IM) fault datasets using an unsupervised LSTMCaps NN, significantly outperforms conventional training approaches with an Average F1 score of 0.879 and effectively generalises across all datasets. The proposed approach was also tested with varying percentages of the training data, in line with the principles of Green AI. Results show that using only 6.25\% of the training data, translating to a 93.7\% reduction in computational power, results in a mere 4.04\% decrease in performance, demonstrating the advantages of the proposed approach in terms of both performance and computational efficiency. Moreover, the algorithm's effectiveness under non-ideal conditions highlights its potential for practical use in real-world applications.
Abstract:Currently available dynamic optimization strategies for Ant Colony Optimization (ACO) algorithm offer a trade-off of slower algorithm convergence or significant penalty to solution quality after each dynamic change occurs. This paper proposes a discrete dynamic optimization strategy called Ant Colony Optimization (ACO) with Aphids, modelled after a real-world symbiotic relationship between ants and aphids. ACO with Aphids strategy is designed to improve solution quality of discrete domain Dynamic Optimization Problems (DOPs) with event-triggered discrete dynamism. The proposed strategy aims to improve the inter-state convergence rate throughout the entire dynamic optimization. It does so by minimizing the fitness penalty and maximizing the convergence speed that occurs after the dynamic change. This strategy is tested against Full-Restart and Pheromone-Sharing strategies implemented on the same ACO core algorithm solving Dynamic Multidimensional Knapsack Problem (DMKP) benchmarks. ACO with Aphids has demonstrated superior performance over the Pheromone-Sharing strategy in every test on average gap reduced by 29.2%. Also, ACO with Aphids has outperformed the Full-Restart strategy for large datasets groups, and the overall average gap is reduced by 52.5%.
Abstract:Deep learning techniques have recently shown promise in the field of anomaly detection, providing a flexible and effective method of modelling systems in comparison to traditional statistical modelling and signal processing-based methods. However, there are a few well publicised issues Neural Networks (NN)s face such as generalisation ability, requiring large volumes of labelled data to be able to train effectively and understanding spatial context in data. This paper introduces a novel NN architecture which hybridises the Long-Short-Term-Memory (LSTM) and Capsule Networks into a single network in a branched input Autoencoder architecture for use on multivariate time series data. The proposed method uses an unsupervised learning technique to overcome the issues with finding large volumes of labelled training data. Experimental results show that without hyperparameter optimisation, using Capsules significantly reduces overfitting and improves the training efficiency. Additionally, results also show that the branched input models can learn multivariate data more consistently with or without Capsules in comparison to the non-branched input models. The proposed model architecture was also tested on an open-source benchmark, where it achieved state-of-the-art performance in outlier detection, and overall performs best over the metrics tested in comparison to current state-of-the art methods.
Abstract:We provide a definition for class density that can be used to measure the aggregate similarity of the samples within each of the classes in a high-dimensional, unstructured dataset. We then put forth several candidate methods for calculating class density and analyze the correlation between the values each method produces with the corresponding individual class test accuracies achieved on a trained model. Additionally, we propose a definition for dataset quality for high-dimensional, unstructured data and show that those datasets that met a certain quality threshold (experimentally demonstrated to be > 10 for the datasets studied) were candidates for eliding redundant data based on the individual class densities.
Abstract:We show that, for each of five datasets of increasing complexity, certain training samples are more informative of class membership than others. These samples can be identified a priori to training by analyzing their position in reduced dimensional space relative to the classes' centroids. Specifically, we demonstrate that samples nearer the classes' centroids are less informative than those that are furthest from it. For all five datasets, we show that there is no statistically significant difference between training on the entire training set and when excluding up to 2% of the data nearest to each class's centroid.
Abstract:We present a dataset consisting of high-resolution images of 13 micro-PCBs captured in various rotations and perspectives relative to the camera, with each sample labeled for PCB type, rotation category, and perspective categories. We then present the design and results of experimentation on combinations of rotations and perspectives used during training and the resulting impact on test accuracy. We then show when and how well data augmentation techniques are capable of simulating rotations vs. perspectives not present in the training data. We perform all experiments using CNNs with and without homogeneous vector capsules (HVCs) and investigate and show the capsules' ability to better encode the equivariance of the sub-components of the micro-PCBs. The results of our experiments lead us to conclude that training a neural network equipped with HVCs, capable of modeling equivariance among sub-components, coupled with training on a diversity of perspectives, achieves the greatest classification accuracy on micro-PCB data.
Abstract:The multidimensional knapsack problem is a well-known constrained optimization problem with many real-world engineering applications. In order to solve this NP-hard problem, a new modified Imperialist Competitive Algorithm with Constrained Assimilation (ICAwICA) is presented. The proposed algorithm introduces the concept of colony independence, a free will to choose between classical ICA assimilation to empires imperialist or any other imperialist in the population. Furthermore, a constrained assimilation process has been implemented that combines classical ICA assimilation and revolution operators, while maintaining population diversity. This work investigates the performance of the proposed algorithm across 101 Multidimensional Knapsack Problem (MKP) benchmark instances. Experimental results show that the algorithm is able to obtain an optimal solution in all small instances and presents very competitive results for large MKP instances.
Abstract:This paper proposes an extension method for Ant Colony Optimization (ACO) algorithm called Dynamic Impact. Dynamic Impact is designed to solve challenging optimization problems that has nonlinear relationship between resource consumption and fitness in relation to other part of the optimized solution. This proposed method is tested against complex real-world Microchip Manufacturing Plant Production Floor Optimization (MMPPFO) problem, as well as theoretical benchmark Multi-Dimensional Knapsack problem (MKP). MMPPFO is a non-trivial optimization problem, due the nature of solution fitness value dependence on collection of wafer-lots without prioritization of any individual wafer-lot. Using Dynamic Impact on single objective optimization fitness value is improved by 33.2%. Furthermore, MKP benchmark instances of small complexity have been solved to 100% success rate where high degree of solution sparseness is observed, and large instances have showed average gap improved by 4.26 times. Algorithm implementation demonstrated superior performance across small and large datasets and sparse optimization problems.
Abstract:We present a convolutional neural network design with additional branches after certain convolutions so that we can extract features with differing effective receptive fields and levels of abstraction. From each branch, we transform each of the final filters into a pair of homogeneous vector capsules. As the capsules are formed from entire filters, we refer to them as filter capsules. We then compare three methods for merging the branches--merging with equal weight and merging with learned weights, with two different weight initialization strategies. This design, in combination with a domain-specific set of randomly applied augmentation techniques, establishes a new state of the art for the MNIST dataset with an accuracy of 99.84% for an ensemble of these models, as well as establishing a new state of the art for a single model (99.79% accurate). These accuracies were achieved with a 75% reduction in both the number of parameters and the number of epochs of training relative to the previously best performing capsule network on MNIST. All training was performed using the Adam optimizer and experienced no overfitting.
Abstract:Ant Colony algorithm has been applied to various optimization problems, however most of the previous work on scaling and parallelism focuses on Travelling Salesman Problems (TSPs). Although, useful for benchmarks and new idea comparison, the algorithmic dynamics does not always transfer to complex real-life problems, where additional meta-data is required during solution construction. This paper looks at real-life outbound supply chain problem using Ant Colony Optimization (ACO) and its scaling dynamics with two parallel ACO architectures - Independent Ant Colonies (IAC) and Parallel Ants (PA). Results showed that PA was able to reach a higher solution quality in fewer iterations as the number of parallel instances increased. Furthermore, speed performance was measured across three different hardware solutions - 16 core CPU, 68 core Xeon Phi and up to 4 Geforce GPUs. State of the art, ACO vectorization techniques such as SS-Roulette were implemented using C++ and CUDA. Although excellent for TSP, it was concluded that for the given supply chain problem GPUs are not suitable due to meta-data access footprint required. Furthermore, compared to their sequential counterpart, vectorized CPU AVX2 implementation achieved 25.4x speedup on CPU while Xeon Phi with its AVX512 instruction set reached 148x on PA with Vectorized (PAwV). PAwV is therefore able to scale at least up to 1024 parallel instances on the supply chain network problem solved.