Viral Thakar

On Using Quasirandom Sequences in Machine Learning for Model Weight Initialization

Aug 05, 2024

Andriy Miranskyy, Adam Sorrenti, Viral Thakar

Abstract: The effectiveness of training neural networks directly impacts computational costs, resource allocation, and model development timelines in machine learning applications. An optimizer's ability to train the model adequately (in terms of trained model performance) depends on the model's initial weights. Model weight initialization schemes use pseudorandom number generators (PRNGs) as a source of randomness. We investigate whether replacing PRNGs with low-discrepancy quasirandom number generators (QRNGs), namely Sobol' sequences, as a source of randomness for initializers can improve model performance. We examine Multi-Layer Perceptron (MLP), Convolutional Neural Network (CNN), Long Short-Term Memory (LSTM), and Transformer architectures trained on the MNIST, CIFAR-10, and IMDB datasets using the SGD and Adam optimizers. Our analysis uses ten initialization schemes: Glorot, He, and LeCun (each in Uniform and Normal variants), plus Orthogonal, Random Normal, Truncated Normal, and Random Uniform. Models with weights set using PRNG- and QRNG-based initializers are compared pairwise for each combination of dataset, architecture, optimizer, and initialization scheme. Our findings indicate that QRNG-based neural network initializers either reach a higher accuracy or achieve the same accuracy more quickly than PRNG-based initializers in 60% of the 120 experiments conducted. Thus, using QRNG-based initializers instead of PRNG-based initializers can speed up and improve model training.
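To make the idea of a quasirandom initializer concrete, here is a minimal sketch of a Glorot Uniform initializer whose randomness comes from a low-discrepancy sequence. It is not the authors' implementation: it swaps in a one-dimensional van der Corput sequence (base 2) as a dependency-free stand-in for the Sobol' sequences used in the paper, and the function names and the row-by-row fill order are assumptions made for illustration.

```python
import math


def van_der_corput(index: int, base: int = 2) -> float:
    """Radical inverse of `index` in `base`: a low-discrepancy point in [0, 1)."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value


def glorot_uniform_quasirandom(fan_in: int, fan_out: int) -> list[list[float]]:
    """Fill a fan_in x fan_out weight matrix from a quasirandom sequence.

    Assumption: weights are drawn in row-major order, starting at sequence
    index 1 (index 0 maps to exactly 0.0). The paper may order draws differently.
    """
    # Standard Glorot Uniform bound: weights lie in [-limit, limit].
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    weights = []
    for row in range(fan_in):
        weights.append([
            -limit + 2.0 * limit * van_der_corput(row * fan_out + col + 1)
            for col in range(fan_out)
        ])
    return weights


if __name__ == "__main__":
    w = glorot_uniform_quasirandom(4, 3)
    for row in w:
        print(["%+.3f" % x for x in row])
```

A PRNG-based initializer would replace the `van_der_corput` call with a uniform pseudorandom draw; everything else (the bound and the rescaling to [-limit, limit]) stays the same, which is what makes the pairwise comparison described in the abstract possible.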

Efficient Single-Shot Multibox Detector for Construction Site Monitoring

Aug 20, 2018

Viral Thakar, Himani Saini, Walid Ahmed, Mohammad M Soltani, Ahmed Aly, Jia Yuan Yu

Abstract: Asset monitoring on construction sites is an intricate, manually intensive task that can benefit greatly from automated solutions built on deep neural networks. We use the Single-Shot Multibox Detector (SSD), chosen for its balance between speed and accuracy, to exploit the widely available images and videos from surveillance cameras on construction sites and automate the monitoring tasks, enabling project managers to better track the performance and optimize the utilization of each resource. We propose to improve the performance of SSD by clustering the predicted boxes instead of using a greedy approach such as non-maximum suppression. We do so using Affinity Propagation Clustering (APC), which clusters the predicted boxes based on a similarity index computed from the spatial features as well as the location of the predicted boxes. In our experiments, we improved the mean average precision of SSD by 3.77% on a custom dataset consisting of images from construction sites and by 1.67% on the PASCAL VOC Challenge.

* 6 pages, 4 figures, to appear in the Proceedings of the ISC2 2018, 16-19 September 2018, Kansas, USA
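For context, the greedy baseline that the abstract proposes to replace can be sketched as follows. This is standard non-maximum suppression, not the paper's APC-based method; the box layout `(x_min, y_min, x_max, y_max)`, the function names, and the 0.5 threshold are assumptions chosen for illustration.

```python
Box = tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: list[Box], scores: list[float],
               iou_threshold: float = 0.5) -> list[int]:
    """Return indices of kept boxes, highest score first.

    A box is kept only if it does not overlap an already-kept box
    by more than `iou_threshold`.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: list[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(greedy_nms(boxes, scores))
```

Because each decision depends only on the current highest-scoring box, the procedure never revisits a choice. A clustering approach such as the one the abstract describes instead groups boxes using pairwise similarities, of which an overlap measure like `iou` could be one ingredient.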

Ensemble-based Adaptive Single-shot Multi-box Detector

Aug 17, 2018

Viral Thakar, Walid Ahmed, Mohammad M Soltani, Jia Yuan Yu

Abstract: We propose two improvements to the Single-Shot Multibox Detector (SSD). First, we propose an adaptive approach for default box selection in SSD. This uses data to reduce the uncertainty in selecting the best aspect ratios for the default boxes and improves the performance of SSD on datasets containing small and complex objects (e.g., equipment at construction sites). We do so by finding the distribution of aspect ratios in the given training dataset and then choosing representative values. Second, we propose an ensemble algorithm, using SSDs as components, which improves the performance of SSD, especially for small training datasets. Compared to the conventional SSD algorithm, adaptive box selection improves mean average precision by 3%, while ensemble-based SSD improves it by 8%.

* 6 pages, 2 figures, to appear in the Proceedings of the ISNCC 2018, 19-21 June 2018, Rome, Italy
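The adaptive default box selection described in the abstract (measure the aspect-ratio distribution of the training boxes, then pick representative values) could look roughly like the sketch below. The abstract does not say how representatives are chosen, so the rule used here (the median of each of `k` equal-sized groups of the sorted ratios) is an assumption, as are the function name and the `(x_min, y_min, x_max, y_max)` box layout.

```python
import statistics

Box = tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def representative_aspect_ratios(boxes: list[Box], k: int = 3) -> list[float]:
    """Pick up to k representative width/height ratios from ground-truth boxes."""
    # Width / height for every non-degenerate box, in ascending order.
    ratios = sorted(
        (x2 - x1) / (y2 - y1)
        for x1, y1, x2, y2 in boxes
        if x2 > x1 and y2 > y1
    )
    if not ratios:
        raise ValueError("no valid boxes to measure")
    k = min(k, len(ratios))
    representatives = []
    for group in range(k):
        # Split the sorted ratios into k contiguous groups of near-equal size.
        lo = group * len(ratios) // k
        hi = (group + 1) * len(ratios) // k
        representatives.append(statistics.median(ratios[lo:hi]))
    return representatives


if __name__ == "__main__":
    training_boxes = [
        (0, 0, 1, 2), (0, 0, 2, 4),   # tall boxes, ratio 0.5
        (0, 0, 1, 1), (0, 0, 3, 3),   # square boxes, ratio 1.0
        (0, 0, 2, 1), (0, 0, 4, 2),   # wide boxes, ratio 2.0
    ]
    print(representative_aspect_ratios(training_boxes, k=3))
```

The returned values would then stand in for the fixed aspect ratios normally configured for SSD default boxes, so that the defaults reflect the shapes actually present in the dataset.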
