Abstract:Systems of polynomial equations arise frequently in computer vision, especially in multiview geometry problems. Traditional methods for solving these systems typically aim to eliminate variables to reach a univariate polynomial, e.g., a tenth-order polynomial for 5-point pose estimation, using clever manipulations, or more generally using Grobner basis, resultants, and elimination templates, leading to successful algorithms for multiview geometry and other problems. However, these methods do not work when the problem is complex and when they do, they face efficiency and stability issues. Homotopy Continuation (HC) can solve more complex problems without the stability issues, and with guarantees of a global solution, but they are known to be slow. In this paper we show that HC can be parallelized on a GPU, showing significant speedups up to 26 times on polynomial benchmarks. We also show that GPU-HC can be generically applied to a range of computer vision problems, including 4-view triangulation and trifocal pose estimation with unknown focal length, which cannot be solved with elimination template but they can be efficiently solved with HC. GPU-HC opens the door to easy formulation and solution of a range of computer vision problems.
Abstract:This paper presents some of the current challenges in designing deep learning artificial intelligence (AI) and integrating it with traditional high-performance computing (HPC) simulations. We evaluate existing packages for their ability to run deep learning models and applications on large-scale HPC systems efficiently, identify challenges, and propose new asynchronous parallelization and optimization techniques for current large-scale heterogeneous systems and upcoming exascale systems. These developments, along with existing HPC AI software capabilities, have been integrated into MagmaDNN, an open-source HPC deep learning framework. Many deep learning frameworks are targeted at data scientists and fall short in providing quality integration into existing HPC workflows. This paper discusses the necessities of an HPC deep learning framework and how those needs can be provided (e.g., as in MagmaDNN) through a deep integration with existing HPC libraries, such as MAGMA and its modular memory management, MPI, CuBLAS, CuDNN, MKL, and HIP. Advancements are also illustrated through the use of algorithmic enhancements in reduced- and mixed-precision, as well as asynchronous optimization methods. Finally, we present illustrations and potential solutions for enhancing traditional compute- and data-intensive applications at ORNL and UTK with AI. The approaches and future challenges are illustrated in materials science, imaging, and climate applications.
Abstract:Heart disease is the leading cause of death, and experts estimate that approximately half of all heart attacks and strokes occur in people who have not been flagged as "at risk." Thus, there is an urgent need to improve the accuracy of heart disease diagnosis. To this end, we investigate the potential of using data analysis, and in particular the design and use of deep neural networks (DNNs) for detecting heart disease based on routine clinical data. Our main contribution is the design, evaluation, and optimization of DNN architectures of increasing depth for heart disease diagnosis. This work led to the discovery of a novel five layer DNN architecture - named Heart Evaluation for Algorithmic Risk-reduction and Optimization Five (HEARO-5) -- that yields best prediction accuracy. HEARO-5's design employs regularization optimization and automatically deals with missing data and/or data outliers. To evaluate and tune the architectures we use k-way cross-validation as well as Matthews correlation coefficient (MCC) to measure the quality of our classifications. The study is performed on the publicly available Cleveland dataset of medical information, and we are making our developments open source, to further facilitate openness and research on the use of DNNs in medicine. The HEARO-5 architecture, yielding 99% accuracy and 0.98 MCC, significantly outperforms currently published research in the area.