Bernie Boscoe

Southern Oregon University

Using different sources of ground truths and transfer learning to improve the generalization of photometric redshift estimation

Nov 27, 2024

Jonathan Soriano, Srinath Saikrishnan, Vikram Seenivasan, Bernie Boscoe, Jack Singal, Tuan Do

Abstract: In this work, we explore methods to improve galaxy redshift predictions by combining different ground truths. Traditional machine learning models rely on training sets with known spectroscopic redshifts, which are precise but only represent a limited sample of galaxies. To make redshift models more generalizable to the broader galaxy population, we investigate transfer learning and directly combining ground truth redshifts derived from photometry and spectroscopy. We use the COSMOS2020 survey to create a dataset, TransferZ, which includes photometric redshift estimates derived from up to 35 imaging filters using template fitting. This dataset spans a wider range of galaxy types and colors compared to spectroscopic samples, though its redshift estimates are less accurate. We first train a base neural network on TransferZ and then refine it using transfer learning on a dataset of galaxies with more precise spectroscopic redshifts (GalaxiesML). In addition, we train a neural network on a combined dataset of TransferZ and GalaxiesML. Both methods reduce bias by $\sim$5x, RMS error by $\sim$1.5x, and catastrophic outlier rates by 1.3x on GalaxiesML, compared to a baseline trained only on TransferZ. However, we also find a reduction in performance for RMS and bias when evaluated on TransferZ data. Overall, our results demonstrate these approaches can meet cosmological requirements.

* 10 pages, 4 figures, 2 tables, accepted to NeurIPS 2024 Workshop ML4PS
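
The abstract reports bias, RMS error, and catastrophic outlier rate without defining them. A minimal sketch of one common convention follows. The scaled residual (z_pred - z_true) / (1 + z_true), the |z_pred - z_true| > 1 outlier cut, and the function name `photoz_metrics` are assumptions for illustration and may differ from the definitions used in the paper.

```python
import math


def photoz_metrics(z_pred, z_true, outlier_cut=1.0):
    """Summary statistics for photometric redshift predictions.

    Assumed conventions (not taken from the paper):
      - scaled residual: dz = (z_pred - z_true) / (1 + z_true)
      - bias: mean of dz
      - rms: square root of the mean of dz**2
      - catastrophic outlier: |z_pred - z_true| > outlier_cut
    """
    if len(z_pred) != len(z_true) or len(z_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(z_true)
    dz = [(p - t) / (1.0 + t) for p, t in zip(z_pred, z_true)]
    bias = sum(dz) / n
    rms = math.sqrt(sum(d * d for d in dz) / n)
    n_outliers = sum(1 for p, t in zip(z_pred, z_true) if abs(p - t) > outlier_cut)
    return {"bias": bias, "rms": rms, "outlier_rate": n_outliers / n}


if __name__ == "__main__":
    # Toy values for illustration only, not survey data.
    print(photoz_metrics([0.0, 1.2, 0.8, 0.5], [0.0, 1.0, 1.0, 3.0]))
```

Comparing these three numbers for a baseline model and a fine-tuned model on the same held-out set is one way to express improvement factors like those quoted in the abstract.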

Learning the Evolution of Physical Structure of Galaxies via Diffusion Models

Nov 27, 2024

Andrew Lizarraga, Eric Hanchen Jiang, Jacob Nowack, Yun Qi Li, Ying Nian Wu, Bernie Boscoe, Tuan Do

Abstract: In astrophysics, understanding the evolution of galaxies primarily through imaging data is fundamental to comprehending the formation of the Universe. This paper introduces a novel approach to conditioning Denoising Diffusion Probabilistic Models (DDPM) on redshifts for generating galaxy images. We explore whether this advanced generative model can accurately capture the physical characteristics of galaxies based solely on their images and redshift measurements. Our findings demonstrate that this model not only produces visually realistic galaxy images but also encodes the underlying changes in physical properties with redshift that are the result of galaxy evolution. This approach marks a significant advancement in using generative models to enhance our scientific insight into cosmic phenomena.
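
For readers unfamiliar with DDPMs, the sketch below shows the standard forward (noising) step, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, together with the scalar redshift that a conditional denoiser would receive as an extra input. The linear beta schedule, the step count, and all function names are illustrative assumptions; the abstract does not specify the paper's schedule, architecture, or conditioning mechanism.

```python
import math
import random


def linear_alpha_bars(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative products of (1 - beta_t) for a linear beta schedule (assumed)."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    alpha_bars = []
    running = 1.0
    for i in range(num_steps):
        beta = beta_start + (beta_end - beta_start) * i / (num_steps - 1)
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_noise(x0, t, alpha_bars, rng=random):
    """Sample x_t from q(x_t | x_0); returns the noisy pixels and the noise used."""
    a_bar = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    x_t = [math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * e for x, e in zip(x0, eps)]
    return x_t, eps


def training_example(pixels, redshift, alpha_bars, rng=random):
    """Build one training tuple for a redshift-conditioned denoiser.

    A conditional model would be trained to predict `target_noise` from
    (x_t, t, redshift); the model itself is omitted here.
    """
    t = rng.randrange(len(alpha_bars))
    x_t, eps = forward_noise(pixels, t, alpha_bars, rng)
    return {"x_t": x_t, "t": t, "redshift": redshift, "target_noise": eps}
```

Passing the redshift alongside the timestep is only one possible design; how the conditioning signal is embedded and injected into the network is a modeling choice not covered by this sketch.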

Using Galaxy Evolution as Source of Physics-Based Ground Truth for Generative Models

Jul 09, 2024

Yun Qi Li, Tuan Do, Evan Jones, Bernie Boscoe, Kevin Alfaro, Zooey Nguyen

Abstract: Generative models producing images have enormous potential to advance discoveries across scientific fields and require metrics capable of quantifying the high dimensional output. We propose that astrophysics data, such as galaxy images, can test generative models with physics-motivated ground truths in addition to human judgment. For example, galaxies in the Universe form and change over billions of years, following physical laws and relationships that are both easy to characterize and difficult to encode in generative models. We build a conditional denoising diffusion probabilistic model (DDPM) and a conditional variational autoencoder (CVAE) and test their ability to generate realistic galaxies conditioned on their redshifts (galaxy ages). This is one of the first studies to probe these generative models using physically motivated metrics. We find that both models produce comparably realistic galaxies based on human evaluation, but our physics-based metrics are better able to discern the strengths and weaknesses of the generative models. Overall, the DDPM model performs better than the CVAE on the majority of the physics-based metrics. Ultimately, if we can show that generative models can learn the physics of galaxy evolution, they have the potential to unlock new astrophysical discoveries.

* 20 pages, 14 figures, 1 table, code: https://github.com/astrodatalab/li2024_public, training data: https://zenodo.org/records/11117528
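
For reference, the two model families compared in this abstract are usually trained with the objectives below, written here with the redshift as a conditioning variable $c$. These are the standard textbook forms, given as an assumption about the general setup rather than the exact losses used in the paper. The symbol $h$ denotes the CVAE latent variable, chosen to avoid a clash with the usual redshift symbol $z$.

```latex
% Conditional VAE: maximize the evidence lower bound
\log p_\theta(x \mid c) \ge
  \mathbb{E}_{q_\phi(h \mid x, c)}\left[\log p_\theta(x \mid h, c)\right]
  - D_{\mathrm{KL}}\left(q_\phi(h \mid x, c) \,\|\, p(h \mid c)\right)

% Conditional DDPM: minimize the simplified noise-prediction loss
L_{\mathrm{simple}} = \mathbb{E}_{x_0, t, \epsilon}
  \left\| \epsilon - \epsilon_\theta\left(\sqrt{\bar\alpha_t}\, x_0
  + \sqrt{1 - \bar\alpha_t}\, \epsilon,\; t,\; c\right) \right\|^2
```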

Elements of effective machine learning datasets in astronomy

Nov 29, 2022

Bernie Boscoe, Tuan Do, Evan Jones, Yunqi Li, Kevin Alfaro, Christy Ma

Abstract: In this work, we identify elements of effective machine learning datasets in astronomy and present suggestions for their design and creation. Machine learning has become an increasingly important tool for analyzing and understanding the large-scale flood of data in astronomy. To take advantage of these tools, datasets are required for training and testing. However, building machine learning datasets for astronomy can be challenging. Astronomical data is collected from instruments built to explore science questions in a traditional fashion rather than to conduct machine learning. Thus, it is often the case that raw data, or even downstream processed data, is not in a form amenable to machine learning. We explore the construction of machine learning datasets and we ask: what elements define effective machine learning datasets? We define effective machine learning datasets in astronomy to be formed with well-defined data points, structure, and metadata. We discuss why these elements are important for astronomical applications and ways to put them into practice. We posit that these qualities not only make the data suitable for machine learning, they also help to foster usable, reusable, and replicable science practices.

* 5 pages, 1 figure, accepted to the peer-reviewed NeurIPS Machine Learning in the Physical Sciences Workshop, 2022
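
As one illustration of "well-defined data points, structure, and metadata", the sketch below gives each example explicit fields and attaches dataset-level metadata that is checked when examples are added. All class names, field names, and validation rules are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class GalaxyExample:
    """One well-defined data point: an identifier, inputs, and a label."""
    object_id: str
    magnitudes: dict  # band name -> magnitude, e.g. {"g": 22.1}
    redshift: float   # label


@dataclass
class DatasetMetadata:
    """Dataset-level context needed to reuse or replicate results."""
    name: str
    version: str
    source_survey: str
    bands: tuple
    label_description: str


@dataclass
class Dataset:
    metadata: DatasetMetadata
    examples: list = field(default_factory=list)

    def add(self, example):
        """Append an example, enforcing the structure declared in the metadata."""
        missing = [b for b in self.metadata.bands if b not in example.magnitudes]
        if missing:
            raise ValueError(f"{example.object_id} is missing bands: {missing}")
        if example.redshift < 0:
            raise ValueError(f"{example.object_id} has a negative redshift")
        self.examples.append(example)
```

Rejecting malformed examples at insertion time keeps every stored data point consistent with the declared structure, which is the property the abstract ties to reusability.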
