Abstract:The detection of earthquakes is a fundamental prerequisite for seismology and contributes to various research areas, such as forecasting earthquakes and understanding the crust/mantle structure. Recent advances in machine learning technologies have enabled the automatic detection of earthquakes from waveform data. In particular, various state-of-the-art deep-learning methods have been applied to this endeavour. In this study, we proposed and tested a novel phase detection method employing deep learning, which is based on a standard convolutional neural network in a new framework. The novelty of the proposed method is its separate explicit learning strategy for global and local representations of waveforms, which enhances its robustness and flexibility. Prior to modelling the proposed method, we identified local representations of the waveform by the multiple clustering of waveforms, in which the data points were optimally partitioned. Based on this result, we considered a global representation and two local representations of the waveform. Subsequently, different phase detection models were trained for each global and local representation. For a new waveform, the overall phase probability was evaluated as a product of the phase probabilities of each model. This additional information on local representations makes the proposed method robust to noise, which is demonstrated by its application to the test data. Furthermore, an application to seismic swarm data demonstrated the robust performance of the proposed method compared with those of other deep learning methods. Finally, in an application to low-frequency earthquakes, we demonstrated the flexibility of the proposed method, which is readily adaptable for the detection of low-frequency earthquakes by retraining only a local model.
Abstract:A multiple-view clustering method is a powerful analytical tool for high-dimensional data, such as functional magnetic resonance imaging (fMRI). It can identify clustering patterns of subjects depending on their functional connectivity in specific brain areas. However, when one applies an existing method to fMRI data, there is a need to simplify the data structure, independently dealing with elements in a functional connectivity matrix, that is, a correlation matrix. In general, elements in a correlation matrix are closely associated. Hence, such a simplification may distort the clustering results. To overcome this problem, we propose a novel multiple-view clustering method based on the Wishart mixture model, which preserves the correlation matrix structure. The uniqueness of this method is that the multiple-view clustering of subjects is based on particular networks of nodes (or regions of interest (ROIs) in fMRI), optimized in a data-driven manner. Hence, it can identify multiple underlying pairs of associations between a subject cluster solution and a ROI network. The key assumption of the method is independence among networks, which is effectively addressed by whitening correlation matrices. We applied the proposed method to synthetic and fMRI data, demonstrating the usefulness and power of the proposed method.
Abstract:We propose a novel method for multiple clustering that assumes a co-clustering structure (partitions in both rows and columns of the data matrix) in each view. The new method is applicable to high-dimensional data. It is based on a nonparametric Bayesian approach in which the number of views and the number of feature-/subject clusters are inferred in a data-driven manner. We simultaneously model different distribution families, such as Gaussian, Poisson, and multinomial distributions in each cluster block. This makes our method applicable to datasets consisting of both numerical and categorical variables, which biomedical data typically do. Clustering solutions are based on variational inference with mean field approximation. We apply the proposed method to synthetic and real data, and show that our method outperforms other multiple clustering methods both in recovering true cluster structures and in computation time. Finally, we apply our method to a depression dataset with no true cluster structure available, from which useful inferences are drawn about possible clustering structures of the data.