Exploring the Infrared Variable Sky with Machine Learning
Abstract
The last decade in astronomy has seen the growth of time-series data and, with it, the emergence
of large surveys. Surveys such as PTF (Law et al., 2009), ZTF (Bellm et al., 2019),
CoRoT (Auvergne et al., 2009), HOYS (Froebrich et al., 2018) and VVV (Minniti et al., 2010)
provide large amounts of wide-area, multi-epoch data. Such surveys bring a multitude of new
issues, many of which take the form of ‘unknown unknowns’, so novel techniques are
required to analyse these data properly. Manual analysis is infeasible, and efforts have therefore
been made to develop tools that automate large portions of the data analysis. The new
dimension of study afforded by these surveys allows us to probe the formation, evolution
and death of stars in unique ways.
A fundamental question arises: “How can we completely and robustly extract information from
modern astronomical time-series data?” Answering this question requires the development
of novel methods and the improvement of those already established. In doing so, I aim to
further expand and explain the demographics of variable stars in the Milky Way. By coupling
more sensitive and robust identification methods with more thorough and complete analysis, I
aim to identify and characterise new and known stellar classes. Together, these efforts seek to provide a
more complete and accurate view of the Milky Way, its structure and demographics.
Key contributions of this thesis include the development of a neural network-based false alarm
probability (NN FAP) method, which significantly improves the identification of periodic variables
in large-scale surveys like VVV, LSST, and TESS. This method generates a universally
comparable and unbiased FAP, which makes it applicable across various types of variable stars and leads
to a more complete view of the demographics of periodic variable stars. The creation of the
PeRiodic Infrared Milky-way VVV Star-catalogue (PRIMVS) underscores the effort to identify
periodic variable stars comprehensively and without bias. Utilising the VVV survey’s depth
and breadth, PRIMVS processed over 86 million candidate variable sources using multiple
period-finding methods and a novel neural network-based false alarm probability, leading to
the identification of approximately 5 million periodic variables. Moreover, the thesis introduces
a contrastive learning approach based on the SimCLR framework with a gated recurrent unit
(GRU) backbone, specifically designed to handle stochastically sampled time-series
data. This method improves variable star classification by creating semantically meaningful embeddings,
enabling more nuanced and accurate analysis. Additionally, the integration of VVV
data with Gaia astrometry enhances distance measurements to star-forming regions, while the
use of Denoising Diffusion Probabilistic Models (DDPMs) for generating synthetic light curves
provides a novel solution for developing extensive training sets.
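
As an illustration of the contrastive objective mentioned above, the following minimal Python sketch computes the NT-Xent loss used by the SimCLR framework for a batch of paired embeddings. It is not the thesis implementation: the function names (`cosine_similarity`, `nt_xent_loss`), the temperature value and the toy embeddings are assumptions chosen for illustration. In practice the embeddings would come from an encoder, such as a GRU, applied to two augmented views of each light curve.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two non-zero vectors given as sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nt_xent_loss(view_a, view_b, temperature=0.5):
    """NT-Xent loss for N positive pairs (view_a[i], view_b[i]).

    Each embedding is treated as an anchor in turn; its partner in the
    other view is the positive and the remaining 2N - 2 embeddings are
    negatives. The temperature default is an illustrative assumption.
    """
    n = len(view_a)
    z = list(view_a) + list(view_b)
    total = 0.0
    for i in range(2 * n):
        j = (i + n) % (2 * n)  # index of the positive partner of anchor i
        positive = cosine_similarity(z[i], z[j]) / temperature
        # Scaled similarities to every embedding except the anchor itself.
        logits = [
            cosine_similarity(z[i], z[k]) / temperature
            for k in range(2 * n)
            if k != i
        ]
        # Numerically stable log-sum-exp for the softmax denominator.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(v - peak) for v in logits))
        total += log_denominator - positive
    return total / (2 * n)


# Hypothetical 2-D embeddings of two light curves under two augmentations.
view_a = [[1.0, 0.0], [0.0, 1.0]]
view_b = [[0.9, 0.1], [0.1, 0.9]]
loss = nt_xent_loss(view_a, view_b)
```

The loss is small when the two views of the same source lie close together in the embedding space and far from the views of other sources, which is the property that makes the resulting embeddings useful for classification.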
Publication date
2024-08-13
Other links
http://hdl.handle.net/2299/28248