Abstract: Despite an extensive body of literature on deep learning optimization, our understanding of what makes an optimization algorithm effective remains fragmented. In particular, it is not well understood whether better optimization translates into better generalization. Current research largely overlooks the inherently stochastic nature of stochastic gradient descent (SGD) and its variants, which has left their statistical performance without comprehensive benchmarking or insight. This paper addresses that gap. Rather than evaluating only the endpoint of individual optimization trajectories, we draw from an ensemble of trajectories to estimate the stationary distribution of stochastic optimizers. Our investigation covers a wide array of techniques, including SGD and its variants, flat-minima optimizers, and new algorithms we propose under the Basin Hopping (BH) framework. Our evaluation spans synthetic functions with known minima and real-world problems in computer vision and natural language processing, and it emphasizes fair benchmarking under a statistical framework: we compare stationary distributions and establish statistical significance. The study yields several key findings on the relationship between training loss and hold-out accuracy, and on the comparable performance of SGD, noise-enabled variants, and the novel optimizers built on the BH framework. Notably, these algorithms perform on par with flat-minima optimizers such as SAM while using half the gradient evaluations. We anticipate that our work will catalyze further exploration in deep learning optimization, encouraging a shift away from single-model approaches towards methodologies that acknowledge and leverage the stochastic nature of optimizers.
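The idea of comparing optimizers through an ensemble of trajectories, rather than a single run, can be illustrated with a small sketch. This is not the paper's code. It assumes a hypothetical one-dimensional double-well loss, models a stochastic optimizer as gradient descent with Gaussian gradient noise, and uses a permutation test on mean endpoint loss as a simple stand-in for whatever significance test the authors apply. All function names and parameter values are illustrative assumptions.

```python
import random
import statistics


def loss(x):
    # Hypothetical tilted double-well: two minima of different depth.
    return (x * x - 1.0) ** 2 + 0.3 * x


def grad(x):
    # Analytic derivative of the loss above.
    return 4.0 * x * (x * x - 1.0) + 0.3


def run_trajectory(rng, lr, noise_std, steps=500):
    # One noisy gradient-descent run from a random start; returns the endpoint loss.
    x = rng.uniform(-2.0, 2.0)
    for _ in range(steps):
        x -= lr * (grad(x) + rng.gauss(0.0, noise_std))
    return loss(x)


def endpoint_sample(seed, lr, noise_std, n_runs=200):
    # Ensemble of independent trajectories: an empirical sample of endpoint
    # losses, used here as a rough proxy for the optimizer's stationary behavior.
    rng = random.Random(seed)
    return [run_trajectory(rng, lr, noise_std) for _ in range(n_runs)]


def permutation_pvalue(a, b, n_perm=2000, seed=0):
    # Two-sided permutation test on the difference of mean endpoint loss.
    rng = random.Random(seed)
    observed = abs(statistics.fmean(a) - statistics.fmean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(statistics.fmean(pooled[:len(a)]) - statistics.fmean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    low_noise = endpoint_sample(seed=1, lr=0.02, noise_std=0.5)
    high_noise = endpoint_sample(seed=2, lr=0.02, noise_std=3.0)
    print("mean loss (low noise): ", statistics.fmean(low_noise))
    print("mean loss (high noise):", statistics.fmean(high_noise))
    print("permutation p-value:   ", permutation_pvalue(low_noise, high_noise))
```

The comparison is between two samples of endpoints, so a conclusion such as "optimizer A reaches lower loss than optimizer B" is backed by a distribution and a p-value instead of one trajectory per optimizer.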
Abstract: Artificial intelligence (AI) and machine learning (ML) have brought about a paradigm shift in health care and can be used for decision support and forecasting by exploring medical data. Recent studies have shown that AI and ML can be used to fight the COVID-19 pandemic. The objective of this review is therefore to summarize recent AI- and ML-based studies that have focused on fighting the COVID-19 pandemic. From an initial set of 634 articles, a total of 35 articles were selected through an extensive inclusion-exclusion process. In our review, we explored the objectives of the existing studies (i.e., the role of AI/ML in fighting the COVID-19 pandemic); the context of each study (i.e., whether it focused on a specific country or took a global perspective); the type and volume of the dataset; the methodology, algorithms, or techniques adopted in the prediction or diagnosis processes; and the mapping of algorithms and techniques to data types, highlighting their prediction or classification accuracy. We focused in particular on the use of AI/ML in analyzing pandemic data in order to depict the most recent progress of AI in fighting COVID-19, and we point out the potential scope for further research.