Original images of a spiral galaxy and an elliptical galaxy and their degraded versions, used to train the neural network.
A study led by the Department of Physics and Astronomy at the University of Pennsylvania and the Institute of Physics of Cantabria (UC-CSIC) has produced the largest catalogue of galaxy morphological classifications to date, covering 27 million galaxies. Helena Domínguez, a researcher at the Institute of Space Sciences (ICE-CSIC), is the second author of the study, recently published in the journal Monthly Notices of the Royal Astronomical Society (MNRAS).
The researchers used data from the Dark Energy Survey (DES), a dataset cataloguing hundreds of millions of distant galaxies over six years, together with artificial intelligence: specifically, a machine learning algorithm that learns to separate galaxies into morphological types with up to 97% accuracy, even for faint and distant galaxies.
The morphology of galaxies is closely related to the kinds of stars they contain and to their formation mechanisms. This catalogue includes two main morphological types: spiral galaxies, which have a rotating disk where new stars are born, and elliptical galaxies, the most massive galaxies in the Universe, which are composed of old stars and dominated by random motions.
It is easy to distinguish these two galaxy types at a glance, but there are two important problems. First, the huge number of galaxies to be classified makes automated classification necessary. Second, distant galaxies look fainter and smaller, so their images are usually very noisy.
The team degraded high-quality images of nearby galaxies to look as they would if they were more distant, and used the known, correct labels of the originals to train a convolutional neural network. In this way, the network learned to classify even the most difficult examples. According to the study, the algorithm identifies a galaxy's morphology correctly 97% of the time, regardless of the noise level and spatial resolution of the images.
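The degradation step described above can be illustrated with a minimal sketch. It is not the procedure used in the study, whose details are not given here: it assumes that "more distant" can be approximated by averaging blocks of pixels (lower resolution) and adding Gaussian noise. The names `degrade_image` and `toy_galaxy`, the downsampling factor and the noise level are all hypothetical choices made for illustration.

```python
import numpy as np

def degrade_image(image, factor, noise_sigma, rng):
    """Return a smaller, noisier copy of a 2-D image (illustrative only)."""
    h, w = image.shape
    # Crop so both dimensions divide evenly by the downsampling factor.
    h_crop, w_crop = h - h % factor, w - w % factor
    cropped = image[:h_crop, :w_crop]
    # Block-average: each factor x factor patch becomes one output pixel.
    blocks = cropped.reshape(h_crop // factor, factor, w_crop // factor, factor)
    small = blocks.mean(axis=(1, 3))
    # Add zero-mean Gaussian noise to mimic a fainter, noisier observation.
    return small + rng.normal(0.0, noise_sigma, size=small.shape)

def toy_galaxy(size=64, scale=6.0):
    """Hypothetical stand-in for a high-quality image: a smooth, centred blob."""
    y, x = np.mgrid[:size, :size]
    r = np.hypot(x - size / 2, y - size / 2)
    return np.exp(-r / scale)

rng = np.random.default_rng(0)
original = toy_galaxy()
degraded = degrade_image(original, factor=4, noise_sigma=0.05, rng=rng)

# The label comes from the high-quality original, not from the degraded copy.
label = "elliptical"  # assumed label for this toy example
training_pair = (degraded, label)
```

The key design point is in the last lines: the network is shown the degraded image, but the label attached to it is the one established from the original, where the classification is reliable.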
This study shows that machines can recover features that remain hidden to the human eye and can separate useful signal from noise when trained with the correct labels. As a result, machines can reliably classify images of fainter galaxies.
Convolutional neural networks (CNNs) have proven extremely successful for analysing and classifying galaxy images. A CNN is a deep learning algorithm that takes an input image, learns which features of that image are informative, and uses those features to distinguish one class of object from another.
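To make the idea concrete, here is a minimal sketch of a single forward pass through a tiny convolutional network, written with NumPy only. It does not reproduce the architecture used in the study: the number of filters, the filter size, the random (untrained) weights and the choice to read the output as a "spiral" probability are all assumptions made for illustration.

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation, the core operation of a convolutional layer."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size=2):
    """Keep the maximum of each size x size block, shrinking the feature map."""
    h, w = feature_map.shape
    h_crop, w_crop = h - h % size, w - w % size
    blocks = feature_map[:h_crop, :w_crop].reshape(
        h_crop // size, size, w_crop // size, size
    )
    return blocks.max(axis=(1, 3))

def forward(image, kernels, weights, bias):
    """Convolution -> ReLU -> pooling -> linear layer -> sigmoid probability."""
    maps = [max_pool(np.maximum(conv2d(image, k), 0.0)) for k in kernels]
    features = np.concatenate([m.ravel() for m in maps])
    logit = features @ weights + bias
    return 1.0 / (1.0 + np.exp(-logit))

rng = np.random.default_rng(0)
image = rng.random((16, 16))          # stand-in for a galaxy image
kernels = rng.normal(size=(4, 3, 3))  # four untrained 3x3 filters
# 16x16 image -> 14x14 after a 3x3 filter -> 7x7 after pooling, for 4 filters.
weights = rng.normal(scale=0.1, size=4 * 7 * 7)
p_spiral = forward(image, kernels, weights, bias=0.0)
```

With random weights the output is meaningless. Training consists of adjusting the filters and the final weights so that the predicted probability matches the known labels of many example images.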
This automated method has made it possible to assign a classification to 27 million galaxies and produce the largest morphological galaxy catalogue published to date.
Some of the galaxies in the catalogue are so distant that their light has travelled for up to 8 gigayears (Gyr), that is, 8 billion years, to reach us. The catalogue therefore gives an approximate picture of what galaxies looked like when the Universe was half its current age, making it possible to study how the shapes of galaxies have changed over the last 8 Gyr and how these structural changes are linked to the evolutionary paths of galaxies.
The fact that machines can learn to recognize patterns in noisy data could have direct applications in other fields, such as security (e.g. facial recognition), industrial image recognition, clinical diagnosis or climate change research.
This research is presented in the paper “Pushing automated morphological classifications to their limits with the Dark Energy Survey”, published in the journal Monthly Notices of the Royal Astronomical Society (MNRAS).