Abstract:Analyzing time series of fluxes from stars, known as stellar light curves, can reveal valuable information about stellar properties. However, most current methods rely on extracting summary statistics, and studies using deep learning have been limited to supervised approaches. In this research, we investigate the scaling law properties that emerge when learning from astronomical time series data using self-supervised techniques. By employing the GPT-2 architecture, we show the learned representation improves as the number of parameters increases from $10^4$ to $10^9$, with no signs of performance plateauing. We demonstrate that a self-supervised Transformer model achieves 3-10 times the sample efficiency compared to the state-of-the-art supervised learning model when inferring the surface gravity of stars as a downstream task. Our research lays the groundwork for analyzing stellar light curves by examining them through large-scale auto-regressive generative models.
Abstract:Light curves of stars encapsulate a wealth of information about stellar oscillations and granulation, thereby offering key insights into the internal structure and evolutionary state of stars. Conventional asteroseismic techniques have been largely confined to power spectral analysis, neglecting the valuable phase information contained within light curves. While recent machine learning applications in asteroseismology utilizing Convolutional Neural Networks (CNNs) have successfully inferred stellar attributes from light curves, they are often limited by the local feature extraction inherent in convolutional operations. To circumvent these constraints, we present $\textit{Astroconformer}$, a Transformer-based deep learning framework designed to capture long-range dependencies in stellar light curves. Our empirical analysis, which focuses on estimating surface gravity ($\log g$), is grounded in a carefully curated dataset derived from $\textit{Kepler}$ light curves. These light curves feature asteroseismic $\log g$ values spanning from 0.2 to 4.4. Our results underscore that, in the regime where the training data is abundant, $\textit{Astroconformer}$ attains a root-mean-square-error (RMSE) of 0.017 dex around $\log g \approx 3 $. Even in regions where training data are sparse, the RMSE can reach 0.1 dex. It outperforms not only the K-nearest neighbor-based model ($\textit{The SWAN}$) but also state-of-the-art CNNs. Ablation studies confirm that the efficacy of the models in this particular task is strongly influenced by the size of their receptive fields, with larger receptive fields correlating with enhanced performance. Moreover, we find that the attention mechanisms within $\textit{Astroconformer}$ are well-aligned with the inherent characteristics of stellar oscillations and granulation present in the light curves.