Galaxy morphological classification in deep-wide surveys via unsupervised machine learning
Read, Shaun C.
Geach, James E.
Galaxy morphology is a fundamental quantity that is essential not only for the full spectrum of galaxy-evolution studies, but also for a plethora of science in observational cosmology. While a rich literature exists on morphological-classification techniques, the unprecedented data volumes, coupled in some cases with the short cadences of forthcoming 'Big-Data' surveys (e.g. from the LSST), present novel challenges for this field. Large data volumes make such datasets intractable for visual inspection (even via massively distributed platforms like Galaxy Zoo), while short cadences make it difficult to employ techniques like supervised machine learning, since it may be impractical to repeatedly produce training sets on short timescales. Unsupervised machine learning, which does not require training sets, is ideally suited to the morphological analysis of new and forthcoming surveys. Here, we employ an algorithm that performs clustering of graph representations in order to group image patches with similar visual properties, as well as the objects constructed from those patches, such as galaxies. We implement the algorithm on the Hyper-Suprime-Cam Subaru-Strategic-Program Ultra-Deep survey to autonomously reduce the galaxy population to a small number (160) of 'morphological clusters', each populated by galaxies with similar morphologies, which are then benchmarked using visual inspection. The morphological classifications (which we release publicly) exhibit a high level of purity, and reproduce known trends in key galaxy properties as a function of morphological type at z