O. Lahav

DES collaboration

A machine learning approach to galaxy properties: Joint redshift - stellar mass probability distributions with Random Forest

Dec 10, 2020

S. Mucesh, W. G. Hartley, A. Palmese, O. Lahav, L. Whiteway, A. Amon, K. Bechtol, G. M. Bernstein, A. Carnero Rosell, M. Carrasco Kind(+63 more)

Abstract: We demonstrate that highly accurate joint redshift - stellar mass PDFs can be obtained using the Random Forest (RF) machine learning (ML) algorithm, even with few photometric bands available. As an example, we use the Dark Energy Survey (DES), combined with the COSMOS2015 catalogue for redshifts and stellar masses. We build two ML models: one containing deep photometry in the $griz$ bands, and the second reflecting the photometric scatter present in the main DES survey, with carefully constructed representative training data in each case. We validate our joint PDFs for $10,699$ test galaxies by utilising the copula probability integral transform (copPIT) and the Kendall distribution function, and we use their univariate counterparts to validate the marginals. Benchmarked against a basic set-up of the template-fitting code BAGPIPES, our ML-based method outperforms template fitting on all of our pre-defined performance metrics. In addition to accuracy, the RF is extremely fast, able to compute joint PDFs for a million galaxies in just over $2$ hours on consumer computer hardware. Such speed enables PDFs to be derived in real time within analysis codes, solving potential storage issues. As part of this work we have developed GALPRO, a highly intuitive and efficient Python package to rapidly generate multivariate PDFs on-the-fly. GALPRO is documented and available for researchers to use in their cosmology and galaxy evolution studies at https://galpro.readthedocs.io/.

* 18 pages, 8 figures, Submitted to MNRAS
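
The univariate probability integral transform (PIT) mentioned in the abstract can be illustrated with a minimal sketch. This is not GALPRO code and not the authors' implementation: the function names `pit_value` and `pit_histogram`, the Gaussian toy predictions, and all numbers below are assumptions made for illustration. Each predicted PDF is represented by samples, and the PIT is the empirical CDF of those samples evaluated at the true value. For well-calibrated PDFs the PIT values should be roughly uniform on [0, 1].

```python
import random


def pit_value(samples, truth):
    """Empirical CDF of the predicted samples, evaluated at the true value."""
    return sum(1 for s in samples if s <= truth) / len(samples)


def pit_histogram(pit_values, n_bins=10):
    """Count PIT values in equal-width bins on [0, 1]; 1.0 goes in the last bin."""
    counts = [0] * n_bins
    for p in pit_values:
        idx = min(int(p * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts


# Toy data (hypothetical): each "galaxy" has a Gaussian predicted PDF, and the
# true value is drawn from that same Gaussian, so the PDFs are calibrated.
rng = random.Random(0)
pits = []
for _ in range(2000):
    mu = rng.uniform(0.1, 1.5)  # assumed centre of the predicted redshift PDF
    truth = rng.gauss(mu, 0.05)  # true value consistent with the prediction
    samples = [rng.gauss(mu, 0.05) for _ in range(200)]  # samples from the PDF
    pits.append(pit_value(samples, truth))

hist = pit_histogram(pits)
print(hist)  # roughly flat for calibrated PDFs
```

A histogram that is peaked in the middle would suggest PDFs that are too broad, and one peaked at the edges would suggest PDFs that are too narrow. The copula PIT used for the joint PDFs extends this idea to two dimensions and is not shown here.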

Machine Learning for Searching the Dark Energy Survey for Trans-Neptunian Objects

Sep 27, 2020

B. Henghes, O. Lahav, D. W. Gerdes, E. Lin, R. Morgan, T. M. C. Abbott, M. Aguena, S. Allam, J. Annis, S. Avila(+49 more)

Abstract: In this paper we investigate how implementing machine learning could improve the efficiency of the search for Trans-Neptunian Objects (TNOs) within Dark Energy Survey (DES) data when used alongside orbit fitting. The discovery of multiple TNOs that appear to show a similarity in their orbital parameters has led to the suggestion that one or more undetected planets, an as yet undiscovered "Planet 9", may be present in the outer Solar System. DES is well placed to detect such a planet and has already been used to discover many other TNOs. Here, we perform tests on eight different supervised machine learning algorithms, using a dataset consisting of simulated TNOs buried within real DES noise data. We found that the best performing classifier was the Random Forest which, when optimised, performed well at detecting the rare objects. We achieve an area under the receiver operating characteristic (ROC) curve of AUC $= 0.996 \pm 0.001$. After optimising the decision threshold of the Random Forest, we achieve a recall of 0.96 while maintaining a precision of 0.80. Finally, by using the optimised classifier to pre-select objects, we are able to run the orbit-fitting stage of our detection pipeline five times faster.

* Submitted to PASP, 15 pages, 5 figures
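
The decision-threshold step described in the abstract can be sketched as follows. This is an illustrative assumption about how such a threshold might be chosen, not the authors' procedure: the helper names `precision_recall` and `best_threshold`, the selection rule (highest recall subject to a minimum precision), and the toy scores are all hypothetical. The scores stand in for the class probabilities a classifier such as a Random Forest would output.

```python
def precision_recall(scores, labels, threshold):
    """Precision and recall when objects with score >= threshold are flagged."""
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if flagged and label:
            tp += 1
        elif flagged and not label:
            fp += 1
        elif not flagged and label:
            fn += 1
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def best_threshold(scores, labels, min_precision):
    """Return (threshold, precision, recall) with the highest recall among
    thresholds whose precision is at least min_precision, or None."""
    best = None
    for t in sorted(set(scores)):
        p, r = precision_recall(scores, labels, t)
        if p >= min_precision and (best is None or r > best[2]):
            best = (t, p, r)
    return best


# Toy classifier scores (hypothetical); label 1 marks a simulated true object.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.10, 0.05]
labels = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]

result = best_threshold(scores, labels, min_precision=0.8)
print(result)
```

Lowering the threshold flags more candidates, which raises recall at the cost of precision. Fixing a precision floor keeps the number of false candidates passed to a later, more expensive stage under control.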
