Abstract:In this work, we propose a deep learning-based classification model of astronomical objects using alerts reported by the Zwicky Transient Facility (ZTF) survey. The model takes as inputs sequences of stamp images and metadata contained in each alert, as well as features from the All-WISE catalog. The proposed model, called temporal stamp classifier, is able to discriminate between three classes of astronomical objects: Active Galactic Nuclei (AGN), Super-Novae (SNe) and Variable Stars (VS), with an accuracy of approximately 98% in the test set, when using 2 to 5 detections. The results show that the model performance improves with the addition of more detections. Simple recurrence models obtain competitive results with those of more complex models such as LSTM.We also propose changes to the original stamp classifier model, which only uses the first detection. The performance of the latter model improves with changes in the architecture and the addition of random rotations, achieving a 1.46% increase in test accuracy.
Abstract:Due to the latest advances in technology, telescopes with significant sky coverage will produce millions of astronomical alerts per night that must be classified both rapidly and automatically. Currently, classification consists of supervised machine learning algorithms whose performance is limited by the number of existing annotations of astronomical objects and their highly imbalanced class distributions. In this work, we propose a data augmentation methodology based on Generative Adversarial Networks (GANs) to generate a variety of synthetic light curves from variable stars. Our novel contributions, consisting of a resampling technique and an evaluation metric, can assess the quality of generative models in unbalanced datasets and identify GAN-overfitting cases that the Fr\'echet Inception Distance does not reveal. We applied our proposed model to two datasets taken from the Catalina and Zwicky Transient Facility surveys. The classification accuracy of variable stars is improved significantly when training with synthetic data and testing with real data with respect to the case of using only real data.
Abstract:In astronomical surveys, such as the Zwicky Transient Facility (ZTF), supernovae (SNe) are relatively uncommon objects compared to other classes of variable events. Along with this scarcity, the processing of multi-band light-curves is a challenging task due to the highly irregular cadence, long time gaps, missing-values, low number of observations, etc. These issues are particularly detrimental for the analysis of transient events with SN-like light-curves. In this work, we offer three main contributions. First, based on temporal modulation and attention mechanisms, we propose a Deep Attention model called TimeModAttn to classify multi-band light-curves of different SN types, avoiding photometric or hand-crafted feature computations, missing-values assumptions, and explicit imputation and interpolation methods. Second, we propose a model for the synthetic generation of SN multi-band light-curves based on the Supernova Parametric Model (SPM). This allows us to increase the number of samples and the diversity of the cadence. The TimeModAttn model is first pre-trained using synthetic light-curves in a semi-supervised learning scheme. Then, a fine-tuning process is performed for domain adaptation. The proposed TimeModAttn model outperformed a Random Forest classifier, increasing the balanced-$F_1$score from $\approx.525$ to $\approx.596$. The TimeModAttn model also outperformed other Deep Learning models, based on Recurrent Neural Networks (RNNs), in two scenarios: late-classification and early-classification. Finally, we conduct interpretability experiments. High attention scores are obtained for observations earlier than and close to the SN brightness peaks, which are supported by an early and highly expressive learned temporal modulation.
Abstract:We present a real-time stamp classifier of astronomical events for the ALeRCE (Automatic Learning for the Rapid Classification of Events) broker. The classifier is based on a convolutional neural network with an architecture designed to exploit rotational invariance of the images, and trained on alerts ingested from the Zwicky Transient Facility (ZTF). Using only the \textit{science, reference} and \textit{difference} images of the first detection as inputs, along with the metadata of the alert as features, the classifier is able to correctly classify alerts from active galactic nuclei, supernovae (SNe), variable stars, asteroids and bogus classes, with high accuracy ($\sim$94\%) in a balanced test set. In order to find and analyze SN candidates selected by our classifier from the ZTF alert stream, we designed and deployed a visualization tool called SN Hunter, where relevant information about each possible SN is displayed for the experts to choose among candidates to report to the Transient Name Server database. We have reported 3060 SN candidates to date (9.2 candidates per day on average), of which 394 have been confirmed spectroscopically. Our ability to report objects using only a single detection means that 92\% of the reported SNe occurred within one day after the first detection. ALeRCE has only reported candidates not otherwise detected or selected by other groups, therefore adding new early transients to the bulk of objects available for early follow-up. Our work represents an important milestone toward rapid alert classifications with the next generation of large etendue telescopes, such as the Vera C. Rubin Observatory's Legacy Survey of Space and Time.
Abstract:The brain electrical activity presents several short events during sleep that can be observed as distinctive micro-structures in the electroencephalogram (EEG), such as sleep spindles and K-complexes. These events have been associated with biological processes and neurological disorders, making them a research topic in sleep medicine. However, manual detection limits their study because it is time-consuming and affected by significant inter-expert variability, motivating automatic approaches. We propose a deep learning approach based on convolutional and recurrent neural networks for sleep EEG event detection called Recurrent Event Detector (RED). RED uses one of two input representations: a) the time-domain EEG signal, or b) a complex spectrogram of the signal obtained with the Continuous Wavelet Transform (CWT). Unlike previous approaches, a fixed time window is avoided and temporal context is integrated to better emulate the visual criteria of experts. When evaluated on the MASS dataset, our detectors outperform the state of the art in both sleep spindle and K-complex detection with a mean F1-score of at least 80.9% and 82.6%, respectively. Although the CWT-domain model obtained a similar performance than its time-domain counterpart, the former allows in principle a more interpretable input representation due to the use of a spectrogram. The proposed approach is event-agnostic and can be used directly to detect other types of sleep events.
Abstract:The training dynamics of hidden layers in deep learning are poorly understood in theory. Recently, the Information Plane (IP) was proposed to analyze them, which is based on the information-theoretic concept of mutual information (MI). The Information Bottleneck (IB) theory predicts that layers maximize relevant information and compress irrelevant information. Due to the limitations in MI estimation from samples, there is an ongoing debate about the properties of the IP for the supervised learning case. In this work, we derive a theoretical convergence for the IP of autoencoders. The theory predicts that ideal autoencoders with a large bottleneck layer size do not compress input information, whereas a small size causes compression only in the encoder layers. For the experiments, we use a Gram-matrix based MI estimator recently proposed in the literature. We propose a new rule to adjust its parameters that compensates scale and dimensionality effects. Using our proposed rule, we obtain experimental IPs closer to the theory. Our theoretical IP for autoencoders could be used as a benchmark to validate new methods to estimate MI in neural networks. In this way, experimental limitations could be recognized and corrected, helping with the ongoing debate on the supervised learning case.
Abstract:In this work, we propose several enhancements to a geometric transformation based model for anomaly detection in images (GeoTranform). The model assumes that the anomaly class is unknown and that only inlier samples are available for training. We introduce new filter based transformations useful for detecting anomalies in astronomical images, that highlight artifact properties to make them more easily distinguishable from real objects. In addition, we propose a transformation selection strategy that allows us to find indistinguishable pairs of transformations. This results in an improvement of the area under the Receiver Operating Characteristic curve (AUROC) and accuracy performance, as well as in a dimensionality reduction. The models were tested on astronomical images from the High Cadence Transient Survey (HiTS) and Zwicky Transient Facility (ZTF) datasets. The best models obtained an average AUROC of 99.20% for HiTS and 91.39% for ZTF. The improvement over the original GeoTransform algorithm and baseline methods such as One-Class Support Vector Machine, and deep learning based methods is significant both statistically and in practice.
Abstract:We introduce Deep-HiTS, a rotation invariant convolutional neural network (CNN) model for classifying images of transients candidates into artifacts or real sources for the High cadence Transient Survey (HiTS). CNNs have the advantage of learning the features automatically from the data while achieving high performance. We compare our CNN model against a feature engineering approach using random forests (RF). We show that our CNN significantly outperforms the RF model reducing the error by almost half. Furthermore, for a fixed number of approximately 2,000 allowed false transient candidates per night we are able to reduce the miss-classified real transients by approximately 1/5. To the best of our knowledge, this is the first time CNNs have been used to detect astronomical transient events. Our approach will be very useful when processing images from next generation instruments such as the Large Synoptic Survey Telescope (LSST). We have made all our code and data available to the community for the sake of allowing further developments and comparisons at https://github.com/guille-c/Deep-HiTS.
Abstract:In this work we present a review of the state of the art of information theoretic feature selection methods. The concepts of feature relevance, redundance and complementarity (synergy) are clearly defined, as well as Markov blanket. The problem of optimal feature selection is defined. A unifying theoretical framework is described, which can retrofit successful heuristic criteria, indicating the approximations made by each method. A number of open problems in the field are presented.
Abstract:In this letter, we propose a method for period estimation in light curves from periodic variable stars using correntropy. Light curves are astronomical time series of stellar brightness over time, and are characterized as being noisy and unevenly sampled. We propose to use slotted time lags in order to estimate correntropy directly from irregularly sampled time series. A new information theoretic metric is proposed for discriminating among the peaks of the correntropy spectral density. The slotted correntropy method outperformed slotted correlation, string length, VarTools (Lomb-Scargle periodogram and Analysis of Variance), and SigSpec applications on a set of light curves drawn from the MACHO survey.