Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Peter Melchior

Princeton

The optical and infrared are connected

Mar 05, 2025

Christian K. Jespersen, Peter Melchior, David N. Spergel, Andy D. Goulding, ChangHoon Hahn, Kartheik G. Iyer

Abstract:Galaxies are often modelled as composites of separable components with distinct spectral signatures, implying that different wavelength ranges are only weakly correlated. They are not. We present a data-driven model which exploits subtle correlations between physical processes to accurately predict infrared (IR) WISE photometry from a neural summary of optical SDSS spectra. The model achieves accuracies of $\chi^2_N \approx 1$ for all photometric bands in WISE, as well as good colors. We are also able to tightly constrain typically IR-derived properties, e.g. the bolometric luminosities of AGN and dust parameters such as $\mathrm{q_{PAH}}$. We find that current SED-fitting methods are incapable of making comparable predictions, and that model misspecification often leads to correlated biases in star-formation rates and AGN luminosities. To help improve SED models, we determine what features of the optical spectrum are responsible for our improved predictions, and identify several lines (CaII, SrII, FeI, [OII] and H$\alpha$), which point to the complex chronology of star formation and chemical enrichment being incorrectly modelled.

* 17 pages, 14 figures. 12 pages of Appendix. Submitted to ApJ

Via

Access Paper or Ask Questions

Path-minimizing Latent ODEs for improved extrapolation and inference

Oct 11, 2024

Matt L. Sampson, Peter Melchior

Abstract:Latent ODE models provide flexible descriptions of dynamic systems, but they can struggle with extrapolation and predicting complicated non-linear dynamics. The latent ODE approach implicitly relies on encoders to identify unknown system parameters and initial conditions, whereas the evaluation times are known and directly provided to the ODE solver. This dichotomy can be exploited by encouraging time-independent latent representations. By replacing the common variational penalty in latent space with an $\ell_2$ penalty on the path length of each system, the models learn data representations that can easily be distinguished from those of systems with different configurations. This results in faster training, smaller models, more accurate interpolation and long-time extrapolation compared to the baseline ODE models with GRU, RNN, and LSTM encoder/decoders on tests with damped harmonic oscillator, self-gravitating fluid, and predator-prey systems. We also demonstrate superior results for simulation-based inference of the Lotka-Volterra parameters and initial conditions by using the latents as data summaries for a conditional normalizing flow. Our change to the training loss is agnostic to the specific recognition network used by the decoder and can therefore easily be adopted by other latent ODE models.

* 20 pages 11 figures

Via

Access Paper or Ask Questions

Multiscale Feature Attribution for Outliers

Oct 30, 2023

Jeff Shen, Peter Melchior

Abstract:Machine learning techniques can automatically identify outliers in massive datasets, much faster and more reproducible than human inspection ever could. But finding such outliers immediately leads to the question: which features render this input anomalous? We propose a new feature attribution method, Inverse Multiscale Occlusion, that is specifically designed for outliers, for which we have little knowledge of the type of features we want to identify and expect that the model performance is questionable because anomalous test data likely exceed the limits of the training data. We demonstrate our method on outliers detected in galaxy spectra from the Dark Energy Survey Instrument and find its results to be much more interpretable than alternative attribution approaches.

* 6 pages, 2 figures, accepted to NeurIPS 2023 Workshop on Machine Learning and the Physical Sciences. Code available at https://github.com/al-jshen/imo

Via

Access Paper or Ask Questions

$\texttt{Mangrove}$: Learning Galaxy Properties from Merger Trees

Oct 24, 2022

Christian Kragh Jespersen, Miles Cranmer, Peter Melchior, Shirley Ho, Rachel S. Somerville, Austen Gabrielpillai

Abstract:Efficiently mapping baryonic properties onto dark matter is a major challenge in astrophysics. Although semi-analytic models (SAMs) and hydrodynamical simulations have made impressive advances in reproducing galaxy observables across cosmologically significant volumes, these methods still require significant computation times, representing a barrier to many applications. Graph Neural Networks (GNNs) have recently proven to be the natural choice for learning physical relations. Among the most inherently graph-like structures found in astrophysics are the dark matter merger trees that encode the evolution of dark matter halos. In this paper we introduce a new, graph-based emulator framework, $\texttt{Mangrove}$, and show that it emulates the galactic stellar mass, cold gas mass and metallicity, instantaneous and time-averaged star formation rate, and black hole mass -- as predicted by a SAM -- with root mean squared error up to two times lower than other methods across a $(75 Mpc/h)^3$ simulation box in 40 seconds, 4 orders of magnitude faster than the SAM. We show that $\texttt{Mangrove}$ allows for quantification of the dependence of galaxy properties on merger history. We compare our results to the current state of the art in the field and show significant improvements for all target properties. $\texttt{Mangrove}$ is publicly available.

* 15 pages, 9 figures, 3 tables, 10 pages of Appendices. Accepted for publication in ApJ

Via

Access Paper or Ask Questions

Accelerated Bayesian SED Modeling using Amortized Neural Posterior Estimation

Mar 14, 2022

ChangHoon Hahn, Peter Melchior

Figure 1 for Accelerated Bayesian SED Modeling using Amortized Neural Posterior Estimation

Figure 2 for Accelerated Bayesian SED Modeling using Amortized Neural Posterior Estimation

Figure 3 for Accelerated Bayesian SED Modeling using Amortized Neural Posterior Estimation

Figure 4 for Accelerated Bayesian SED Modeling using Amortized Neural Posterior Estimation

Abstract:State-of-the-art spectral energy distribution (SED) analyses use a Bayesian framework to infer the physical properties of galaxies from observed photometry or spectra. They require sampling from a high-dimensional space of SED model parameters and take $>10-100$ CPU hours per galaxy, which renders them practically infeasible for analyzing the $billions$ of galaxies that will be observed by upcoming galaxy surveys ($e.g.$ DESI, PFS, Rubin, Webb, and Roman). In this work, we present an alternative scalable approach to rigorous Bayesian inference using Amortized Neural Posterior Estimation (ANPE). ANPE is a simulation-based inference method that employs neural networks to estimate the posterior probability distribution over the full range of observations. Once trained, it requires no additional model evaluations to estimate the posterior. We present, and publicly release, ${\rm SED}{flow}$, an ANPE method to produce posteriors of the recent Hahn et al. (2022) SED model from optical photometry. ${\rm SED}{flow}$ takes ${\sim}1$ $second~per~galaxy$ to obtain the posterior distributions of 12 model parameters, all of which are in excellent agreement with traditional Markov Chain Monte Carlo sampling results. We also apply ${\rm SED}{flow}$ to 33,884 galaxies in the NASA-Sloan Atlas and publicly release their posteriors: see https://changhoonhahn.github.io/SEDflow.

* 21 pages, 5 figures; submitted to ApJ; code available at https://changhoonhahn.github.io/SEDflow

Via

Access Paper or Ask Questions

Graph Neural Network-based Resource Allocation Strategies for Multi-Object Spectroscopy

Sep 29, 2021

Tianshu Wang, Peter Melchior

Figure 1 for Graph Neural Network-based Resource Allocation Strategies for Multi-Object Spectroscopy

Figure 2 for Graph Neural Network-based Resource Allocation Strategies for Multi-Object Spectroscopy

Figure 3 for Graph Neural Network-based Resource Allocation Strategies for Multi-Object Spectroscopy

Figure 4 for Graph Neural Network-based Resource Allocation Strategies for Multi-Object Spectroscopy

Abstract:Resource allocation problems are often approached with linear programming techniques. But many concrete allocation problems in the experimental and observational sciences cannot or should not be expressed in the form of linear objective functions. Even if the objective is linear, its parameters may not be known beforehand because they depend on the results of the experiment for which the allocation is to be determined. To address these challenges, we present a bipartite Graph Neural Network architecture for trainable resource allocation strategies. Items of value and constraints form the two sets of graph nodes, which are connected by edges corresponding to possible allocations. The GNN is trained on simulations or past problem occurrences to maximize any user-supplied, scientifically motivated objective function, augmented by an infeasibility penalty. The amount of feasibility violation can be tuned in relation to any available slack in the system. We apply this method to optimize the astronomical target selection strategy for the highly multiplexed Subaru Prime Focus Spectrograph instrument, where it shows superior results to direct gradient descent optimization and extends the capabilities of the currently employed solver which uses linear objective functions. The development of this method enables fast adjustment and deployment of allocation strategies, statistical analyses of allocation patterns, and fully differentiable, science-driven solutions for resource allocation problems.

* The GNN code used in this paper is available at https://github.com/tianshu-wang/PFS-GNN-bipartite

Via

Access Paper or Ask Questions

Unsupervised Resource Allocation with Graph Neural Networks

Jun 17, 2021

Miles Cranmer, Peter Melchior, Brian Nord

Figure 1 for Unsupervised Resource Allocation with Graph Neural Networks

Figure 2 for Unsupervised Resource Allocation with Graph Neural Networks

Figure 3 for Unsupervised Resource Allocation with Graph Neural Networks

Figure 4 for Unsupervised Resource Allocation with Graph Neural Networks

Abstract:We present an approach for maximizing a global utility function by learning how to allocate resources in an unsupervised way. We expect interactions between allocation targets to be important and therefore propose to learn the reward structure for near-optimal allocation policies with a GNN. By relaxing the resource constraint, we can employ gradient-based optimization in contrast to more standard evolutionary algorithms. Our algorithm is motivated by a problem in modern astronomy, where one needs to select-based on limited initial information-among $10^9$ galaxies those whose detailed measurement will lead to optimal inference of the composition of the universe. Our technique presents a way of flexibly learning an allocation strategy by only requiring forward simulators for the physics of interest and the measurement process. We anticipate that our technique will also find applications in a range of resource allocation problems.

* Accepted to PMLR/contributed oral at NeurIPS 2020 Pre-registration Workshop. Code at https://github.com/MilesCranmer/gnn_resource_allocation

Via

Access Paper or Ask Questions

$\texttt{deep21}$: a Deep Learning Method for 21cm Foreground Removal

Oct 29, 2020

T. Lucas Makinen, Lachlan Lancaster, Francisco Villaescusa-Navarro, Peter Melchior, Shirley Ho, Laurence Perreault-Levasseur, David N. Spergel

$Figure 1 for $\texttt{deep21}$: a Deep Learning Method for 21cm Foreground Removal$

$Figure 2 for $\texttt{deep21}$: a Deep Learning Method for 21cm Foreground Removal$

$Figure 3 for $\texttt{deep21}$: a Deep Learning Method for 21cm Foreground Removal$

$Figure 4 for $\texttt{deep21}$: a Deep Learning Method for 21cm Foreground Removal$

Abstract:We seek to remove foreground contaminants from 21cm intensity mapping observations. We demonstrate that a deep convolutional neural network (CNN) with a UNet architecture and three-dimensional convolutions, trained on simulated observations, can effectively separate frequency and spatial patterns of the cosmic neutral hydrogen (HI) signal from foregrounds in the presence of noise. Cleaned maps recover cosmological clustering statistics within 10% at all relevant angular scales and frequencies. This amounts to a reduction in prediction variance of over an order of magnitude on small angular scales ($\ell > 300$), and improved accuracy for small radial scales ($k_{\parallel} > 0.17\ \rm h\ Mpc^{-1})$ compared to standard Principal Component Analysis (PCA) methods. We estimate posterior confidence intervals for the network's prediction by training an ensemble of UNets. Our approach demonstrates the feasibility of analyzing 21cm intensity maps, as opposed to derived summary statistics, for upcoming radio experiments, as long as the simulated foreground model is sufficiently realistic. We provide the code used for this analysis on $\href{https://github.com/tlmakinen/deep21}{\rm GitHub}$, as well as a browser-based tutorial for the experiment and UNet model via the accompanying $\href{http://bit.ly/deep21-colab}{\rm Colab\ notebook}$.

* To be submitted to JCAP. 28 pages, 11 figures

Via

Access Paper or Ask Questions

Hybrid Physical-Deep Learning Model for Astronomical Inverse Problems

Dec 09, 2019

Francois Lanusse, Peter Melchior, Fred Moolekamp

Figure 1 for Hybrid Physical-Deep Learning Model for Astronomical Inverse Problems

Figure 2 for Hybrid Physical-Deep Learning Model for Astronomical Inverse Problems

Figure 3 for Hybrid Physical-Deep Learning Model for Astronomical Inverse Problems

Figure 4 for Hybrid Physical-Deep Learning Model for Astronomical Inverse Problems

Abstract:We present a Bayesian machine learning architecture that combines a physically motivated parametrization and an analytic error model for the likelihood with a deep generative model providing a powerful data-driven prior for complex signals. This combination yields an interpretable and differentiable generative model, allows the incorporation of prior knowledge, and can be utilized for observations with different data quality without having to retrain the deep network. We demonstrate our approach with an example of astronomical source separation in current imaging data, yielding a physical and interpretable model of astronomical scenes.

* 8 pages, accepted submission to the NeurIPS 2019 Machine Learning and the Physical Sciences Workshop

Via

Access Paper or Ask Questions

Block-Simultaneous Direction Method of Multipliers: A proximal primal-dual splitting algorithm for nonconvex problems with multiple constraints

Aug 30, 2017

Fred Moolekamp, Peter Melchior

Figure 1 for Block-Simultaneous Direction Method of Multipliers: A proximal primal-dual splitting algorithm for nonconvex problems with multiple constraints

Figure 2 for Block-Simultaneous Direction Method of Multipliers: A proximal primal-dual splitting algorithm for nonconvex problems with multiple constraints

Figure 3 for Block-Simultaneous Direction Method of Multipliers: A proximal primal-dual splitting algorithm for nonconvex problems with multiple constraints

Figure 4 for Block-Simultaneous Direction Method of Multipliers: A proximal primal-dual splitting algorithm for nonconvex problems with multiple constraints

Abstract:We introduce a generalization of the linearized Alternating Direction Method of Multipliers to optimize a real-valued function $f$ of multiple arguments with potentially multiple constraints $g_\circ$ on each of them. The function $f$ may be nonconvex as long as it is convex in every argument, while the constraints $g_\circ$ need to be convex but not smooth. If $f$ is smooth, the proposed Block-Simultaneous Direction Method of Multipliers (bSDMM) can be interpreted as a proximal analog to inexact coordinate descent methods under constraints. Unlike alternative approaches for joint solvers of multiple-constraint problems, we do not require linear operators $L$ of a constraint function $g(L\ \cdot)$ to be invertible or linked between each other. bSDMM is well-suited for a range of optimization problems, in particular for data analysis, where $f$ is the likelihood function of a model and $L$ could be a transformation matrix describing e.g. finite differences or basis transforms. We apply bSDMM to the Non-negative Matrix Factorization task of a hyperspectral unmixing problem and demonstrate convergence and effectiveness of multiple constraints on both matrix factors. The algorithms are implemented in python and released as an open-source package.

* 13 pages, 4 figures

Via

Access Paper or Ask Questions