Rebecca Ogungbure | Freelancer Portfolio Item #298736

Chapter 1: Introduction

Our planet is just one of the billions of planets in our galaxy, which, together with billions of stars and the other galaxies, makes up our cosmos. The vastness of it all may seem daunting. But now, with machine learning, we finally have a chance to search through the soup that is our cosmos and draw intelligible inferences and conclusions from the big data that describes our universe. Machine learning offers a broad range of algorithms and modeling tools for a wide array of data-processing tasks. In recent years, machine learning has been reported to greatly affect how cosmologists interpret large data sets (Ntampaka et al., 2019). In the coming decade, if we are to make real progress with large cosmological data, we must ensure that the statistical and data-driven tools and models we adopt do not become our limiting factor. Machine learning is a promising tool for the interpretation of cosmological data, and it gives us a chance at breakthroughs as we try to understand our rather complex universe. We are at a critical period of precision cosmology: the large amount of data and precise theory have given us the opportunity to constrain the values of cosmological parameters with exceptional precision (Planck Collaboration et al., 2020).

The aim of this research is to highlight some of the ways in which machine learning models have become vital to how cosmological data is collected, analyzed, and finally interpreted. This research also aims to show how adopting various machine learning simulations can improve the interpretation of large cosmological data sets in the coming decade.
At the end of this research work, I hope to interpret real-life data sets with my proposed machine learning model and to explain in detail how machine learning models have evolved over the years to help unlock the complexity of our universe, moving us closer to precision cosmology and to a meaningful interpretation of large cosmological data.

Chapter 2: Background

2.1 Machine Learning simulations and Cosmology

Cosmology is the study of the universe, that is, its content and its evolution. The universe is believed to be made up of three essential parts: dark energy (responsible for the accelerated expansion of the universe), dark matter (which makes up the majority of the mass density of our universe), and ordinary visible matter (stars, planets, etc.). Dark matter plays a crucial role in the formation of galaxies, and the galaxy clusters that form as a result have become one area of application of machine learning. Another is the set of unknowns that arise from the particle nature of dark matter.

Machine learning has proven useful for cosmic probes that include galaxy clustering, strong and weak gravitational lensing, supernovae, and the cosmic microwave background. Machine learning (ML) models have become essential in detecting and categorizing cosmic sources, and they are also used to extract information from images. Nowadays, machine learning algorithms are successfully employed in classification, clustering, regression, and dimensionality reduction tasks on large sets of high-dimensional input data (Marsland, 2014).
Common machine learning models employed in cosmological studies include supervised models, especially for studying the problems of galaxy formation and evolution in the context of semi-analytical models (SAMs) (Kamdar, Turk & Brunner 2016). ML has allowed the inference of a few complex phenomena and has also provided a distinct and strong connection between the dark matter regime (large galaxy scales) and the baryonic regime (smaller galaxy scales). SAM outputs have also been used to train some ML algorithms with considerable success (Breiman 2001; Geurts, Ernst & Wehenkel 2006). ML algorithms essentially learn approximate relationships between input data and output data so that they can draw useful inferences, and a supervised model is usually used for this.

ML has recently been applied in astronomy with a decent record of success (Ball & Brunner 2010; Ivezić et al. 2014). Applications in astronomy subfields include classification problems such as star-galaxy classification (Ball et al. 2006; Kim et al. 2015), regression problems such as photometric redshift estimation (Ball et al. 2007; Gerdes et al. 2010; Kind & Brunner 2013), galaxy morphology classification (Banerji et al. 2010; Dieleman et al. 2015), determining stellar labels from spectroscopic data (Ness et al. 2015), and estimation of stellar atmospheric parameters (Fiorentin et al. 2007). With their non-parametric nature and powerful predictive capabilities, ML models provide a good opportunity to study galaxy formation and evolution. For example, to study galaxy formation and to analyze the full extent of the influence of dark matter (DM) halos on galaxies in the backdrop of SAMs, the ML algorithms that can be employed include decision trees, random forests (RF), extremely randomized trees (ERT), and k-Nearest Neighbors (kNN).
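As a rough illustration of the supervised setup described above, the sketch below fits a k-nearest-neighbors regressor to a toy catalog and scores it on held-out points by the mean of the squared differences between observed and predicted values. Everything in it is an assumption made for illustration: the "log halo mass to log stellar mass" relation, the noise level, and the function names `knn_predict`, `mean_squared_error`, and `make_toy_catalog` are hypothetical and are not taken from any of the cited studies or from a SAM.

```python
import random


def knn_predict(train_x, train_y, query, k=3):
    """Predict a target as the mean of the k nearest training targets."""
    # Rank training points by distance to the query (one feature only).
    order = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    total = sum((y - y_hat) ** 2 for y, y_hat in zip(observed, predicted))
    return total / len(observed)


def make_toy_catalog(n, seed=0):
    """Hypothetical data: log halo mass -> log stellar mass with scatter.

    The linear relation and the noise level are invented for this sketch.
    """
    rng = random.Random(seed)
    xs = [rng.uniform(11.0, 14.0) for _ in range(n)]
    ys = [0.8 * x - 0.5 + rng.gauss(0.0, 0.1) for x in xs]
    return xs, ys


xs, ys = make_toy_catalog(200)
train_x, train_y = xs[:150], ys[:150]   # points used to "learn" the relation
test_x, test_y = xs[150:], ys[150:]     # held-out points used for scoring

predictions = [knn_predict(train_x, train_y, x, k=5) for x in test_x]
mse_knn = mean_squared_error(test_y, predictions)

# Baseline for comparison: always predict the mean training target.
baseline = sum(train_y) / len(train_y)
mse_baseline = mean_squared_error(test_y, [baseline] * len(test_y))

print(f"kNN MSE: {mse_knn:.4f}, baseline MSE: {mse_baseline:.4f}")
```

Comparing against a constant baseline is one simple way to check that the model has learned something from the inputs. The tree-based methods mentioned above would be scored in the same way.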
To measure how well these algorithms learn the relationship in a given data set, the mean squared error (MSE) is used, defined as:

$$\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( Y_i - \hat{Y}_i \right)^2$$

where $n$ is the number of data points, $Y_i$ are the observed values, and $\hat{Y}_i$ are the predicted values.

Machine learning provides a solid framework to study the halo-galaxy connection in the backdrop of SAMs. However, it is important to note that even though ML has made progress in halo-galaxy probes, it is not a replacement for SAMs.

2.2 Future Prospects of Machine Learning Models in Cosmology

Cosmology is rich in data, and in recent years machine learning has greatly improved how cosmologists interpret this data. Astronomy likewise holds a large store of data waiting to be explored. Efficient ML algorithms are required to harness this data. As the future we anticipated comes into view with every passing year, we need to be more strategic in interpreting big data in order to reach sound hypotheses and accurate conclusions. Realizing the full potential of machine learning for the study of the cosmos becomes of utmost importance. As we put scientists' past speculations and predictions about our universe to the test, it is critical to create ML models and algorithms that actually work and are useful across fields. The interpretability of ML is one area of active research that promises to increase the quality and diversity of interpretation models in data science over the next decade.

Cosmology, in turn, presents a challenge for ML researchers, because it brings new tasks and new questions about how to employ ML models in interpreting data sets. These cosmological challenges serve as opportunities for breakthroughs in the basic understanding of ML. The point where cosmology meets ML therefore showcases the benefits that each field provides to the other.
With the appearance of more data, both small and big, astronomy has stepped into the big data era. These data sets afford ML more opportunities for application in both cosmology and astronomy. One such opportunity is the big data made available by LSST (LSST Science Collaboration et al., 2009). The continued development and future implementation of carefully designed and selected ML algorithms at both the image-processing level (Goulding et al., 2018; Dai & Tong, 2018; Ackermann et al., 2018) and the catalog level (Narayan et al., 2018; Malz et al., 2018) have the potential to produce meaningful advances in our ability to efficiently extract scientifically useful information, such as classification, distance, morphology, and mass, from the LSST data.

Another area of application of ML is supernova cosmology. Machine learning simulations provide a means for more accurate supernova classification, which is critical in analyzing massive public supernova data sets. As astronomical data sets become larger and more difficult to process, ML has become increasingly popular (Ball & Brunner 2010; Bloom & Richards 2012). Traditionally, only Type Ia supernovae are used for cosmology. However, supernova cosmology is now possible without knowing the type of the supernovae being used, for example by using Bayesian methods (Kunz et al. 2007; Hlozek et al. 2012; Newling et al. 2012; Knights et al. 2013; Rubin et al. 2015).

Thus far, ML has made great strides in explaining the complexity of the cosmos, and even greater successes have been recorded in research cutting across different fields. There is still, however, a long way to go before ML delivers outstanding results in cosmology over the next decade. The successes recorded so far hint at the great potential that ML offers for data discovery and interpretation, especially as data sets become bigger and their complexity increases.

References:

1. Ball & Brunner 2010; Bloom & Richards 2012
2. Ball et al. 2007; Gerdes et al. 2010; Kind & Brunner 2013
3. Ball & Brunner 2010; Ivezić et al. 2014
4. Ball et al. 2006; Kim et al. 2015
5. Banerji et al. 2010; Dieleman et al. 2015
6. Breiman 2001; Geurts, Ernst & Wehenkel 2006
7. Fiorentin et al. 2007
8. Goulding et al. 2018; Dai & Tong 2018; Ackermann et al. 2018
9. Kamdar, Turk & Brunner 2016
10. Kunz et al. 2007; Hlozek et al. 2012; Newling et al. 2012; Knights et al. 2013; Rubin et al. 2015
11. LSST Science Collaboration et al. 2009
12. Marsland, S. Machine Learning (CRC Press, Taylor & Francis Inc., Boca Raton, FL, 2014)
13. Ness et al. 2015
14. Narayan et al. 2018; Malz et al. 2018
15. Ntampaka et al. 2019
16. Planck Collaboration et al. 2020