Abstract:This work explores the theoretical and practical foundations of denoising diffusion probabilistic models (DDPMs) and score-based generative models, which leverage stochastic processes and Brownian motion to model complex data distributions. These models employ forward and reverse diffusion processes defined through stochastic differential equations (SDEs) to iteratively add and remove noise, enabling high-quality data generation. By analyzing the performance bounds of these models, we demonstrate how score estimation errors propagate through the reverse process, and we bound the total variation distance using discrete Girsanov transformations, Pinsker's inequality, and the data processing inequality (DPI), viewed through an information-theoretic lens.
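As a purely illustrative numerical check of two of the tools named above (not the paper's analysis, which concerns the diffusion processes themselves), the following sketch verifies Pinsker's inequality, $\mathrm{TV}(p, q) \le \sqrt{\mathrm{KL}(p \,\|\, q)/2}$, and the data processing inequality on small discrete distributions. The distributions `p` and `q`, the grouping used for coarsening, and all helper names are assumptions chosen for illustration.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats; assumes q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def total_variation(p, q):
    """TV(p, q) = 0.5 * sum_i |p_i - q_i|."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

def pinsker_bound(p, q):
    """Upper bound on TV(p, q) given by Pinsker's inequality."""
    return math.sqrt(0.5 * kl_divergence(p, q))

def coarsen(p, groups):
    """Push p through a deterministic map that merges the outcomes in each group."""
    return [sum(p[i] for i in group) for group in groups]

# Hypothetical example distributions on three outcomes.
p = [0.5, 0.3, 0.2]
q = [0.4, 0.4, 0.2]

tv = total_variation(p, q)
bound = pinsker_bound(p, q)
print(f"TV = {tv:.4f}, Pinsker bound = {bound:.4f}")

# Data processing inequality: merging outcomes cannot increase the KL divergence.
groups = [[0], [1, 2]]
kl_before = kl_divergence(p, q)
kl_after = kl_divergence(coarsen(p, groups), coarsen(q, groups))
print(f"KL before = {kl_before:.4f}, KL after coarsening = {kl_after:.4f}")
```

Here the total variation distance stays below the Pinsker bound, and the KL divergence does not increase after coarsening, as both inequalities require.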
Abstract:Conventional calibration of Baryon Acoustic Oscillations (BAO) data relies on estimating the sound horizon at the drag epoch, $r_d$, from early-universe observations by assuming a cosmological model. We present a recalibration of two independent BAO datasets, SDSS and DESI, by employing deep learning techniques for model-independent estimation of $r_d$, and explore the impact on $\Lambda$CDM cosmological parameters. Significant reductions in both the Hubble ($H_0$) and clustering ($S_8$) tensions are observed for both recalibrated datasets. Moderate shifts in some other parameters motivate further exploration of such data-driven approaches.
Abstract:This research compares PDF parsing and Optical Character Recognition (OCR) methods for extracting Nepali content from PDFs. PDF parsing offers fast and accurate extraction but faces challenges with non-Unicode Nepali fonts. OCR, specifically PyTesseract, overcomes these challenges, providing versatility for both digital and scanned PDFs. The study reveals that while PDF parsers are faster, their accuracy varies with the type of PDF. In contrast, OCR tools, PyTesseract in particular, demonstrate consistent accuracy at the expense of slightly longer extraction times. Considering the project's emphasis on Nepali PDFs, PyTesseract emerges as the most suitable library, balancing extraction speed and accuracy.
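One way to act on this trade-off is to try the fast parser first and fall back to OCR only when the parsed text does not look like Unicode Nepali. The sketch below is a hypothetical heuristic, not the study's method: it assumes that text extracted from a non-Unicode font shows up mostly as ASCII letters, and the function names, the 0.5 threshold, and the sample strings are all illustrative assumptions.

```python
def devanagari_ratio(text):
    """Share of Devanagari characters (U+0900-U+097F) among Devanagari plus ASCII letters."""
    devanagari = sum(1 for ch in text if "\u0900" <= ch <= "\u097f")
    ascii_letters = sum(1 for ch in text if ch.isascii() and ch.isalpha())
    total = devanagari + ascii_letters
    return devanagari / total if total else 0.0

def needs_ocr(parsed_text, threshold=0.5):
    """Hypothetical rule: fall back to OCR when little of the parsed text is Devanagari.

    Empty text (for example, from a scanned page with no text layer) also triggers OCR.
    """
    return devanagari_ratio(parsed_text) < threshold

# Illustrative inputs: proper Unicode Nepali, ASCII-looking parser output, and no text.
print(needs_ocr("नेपाल सरकार"))
print(needs_ocr("g]kfn ;/sf/"))
print(needs_ocr(""))
```

When `needs_ocr` returns `True`, the page image could then be passed to PyTesseract, for example with `pytesseract.image_to_string(image, lang="nep")`, assuming the Tesseract Nepali language data is installed.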
Abstract:We investigate the prospect of reconstructing the ``cosmic distance ladder'' of the Universe using a novel deep learning framework called LADDER - Learning Algorithm for Deep Distance Estimation and Reconstruction. LADDER is trained on the apparent magnitude data from the Pantheon Type Ia supernovae compilation, incorporating the full covariance information among data points, to produce predictions along with corresponding errors. After employing several validation tests with a number of deep learning models, we pick LADDER as the best-performing one. We then demonstrate applications of our method in the cosmological context, which include serving as a model-independent tool for consistency checks of other datasets such as baryon acoustic oscillations, calibrating high-redshift datasets such as gamma-ray bursts, and acting as a model-independent mock catalog generator for future probes. Our analysis advocates for interested yet cautious consideration of machine learning applications in these contexts.
Abstract:In today's rapidly evolving educational landscape, passive information delivery is giving way to pedagogical approaches that prioritize active student engagement. In large-scale hybrid classrooms, the challenge lies in fostering meaningful, active interaction between students and course content. This study examines the significance of measuring students' earnestness during interactive lecture participation exercises. By analyzing students' responses to interactive lecture poll questions, establishing a clear rubric for evaluating earnestness, and conducting a comprehensive assessment, we introduce EIT (Earnest Insight Toolkit), a tool designed to assess students' engagement in interactive lecture participation exercises, particularly in large-scale hybrid classrooms. With EIT, we aim to equip educators with a means of identifying at-risk students, so that intervention and support strategies can be improved, and of measuring students' levels of engagement with course content.
Abstract:We study the prospects of machine learning algorithms such as Gaussian processes (GP) as a tool to reconstruct the Hubble parameter $H(z)$ with two upcoming gravitational wave missions, namely the evolved Laser Interferometer Space Antenna (eLISA) and the Einstein Telescope (ET). We perform non-parametric reconstructions of $H(z)$ with GP using realistically generated catalogues, assuming various background cosmological models, for each mission. We also take into account the effect of early-time and late-time priors separately on the reconstruction, and hence on the Hubble constant ($H_0$). Our analysis reveals that GPs are quite robust in reconstructing the expansion history of the Universe within the observational window of the specific mission under study. We further confirm that both eLISA and ET would be able to constrain $H(z)$ and $H_0$ to a much higher precision than is possible today, and we assess their possible role in addressing the Hubble tension for each model, on a case-by-case basis.
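For orientation, the standard GP regression formulas behind such a reconstruction can be written as follows. This is a generic sketch under stated assumptions, not the paper's specific setup: a zero-mean prior, a squared-exponential kernel with hyperparameters $\sigma_f$ and $\ell$, data $\mathbf{H} = (H_1, \dots, H_n)$ at redshifts $z_1, \dots, z_n$, and a data covariance matrix $C$ are all assumed here for illustration.

$$
k(z, z') = \sigma_f^2 \exp\!\left(-\frac{(z - z')^2}{2\ell^2}\right), \qquad K_{ij} = k(z_i, z_j), \qquad (\mathbf{k}_*)_i = k(z_i, z_*)
$$

$$
\bar{H}(z_*) = \mathbf{k}_*^{\top} (K + C)^{-1} \mathbf{H}, \qquad
\sigma^2(z_*) = k(z_*, z_*) - \mathbf{k}_*^{\top} (K + C)^{-1} \mathbf{k}_*
$$

Under these assumptions, an estimate of the Hubble constant follows by evaluating the reconstruction at $z_* = 0$, that is, $H_0 \approx \bar{H}(0)$ with uncertainty $\sigma(0)$.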