Abstract:We explore the applicability of the causal analysis based on temporally shifted (lagged) Pearson correlation applied to diverse time series of different natures in context of the problem of financial market prediction. Theoretical discussion is followed by description of the practical approach for specific environment of time series data with diverse nature and sparsity, as applied for environments of financial markets. The data involves various financial metrics computable from raw market data such as real-time trades and snapshots of the limit order book as well as metrics determined upon social media news streams such as sentiment and different cognitive distortions. The approach is backed up with presentation of algorithmic framework for data acquisition and analysis, concluded with experimental results, and summary pointing out at the possibility to discriminate causal connections between different sorts of real field market data with further discussion on present issues and possible directions of the following work.
Abstract:In this paper, we explore the usability of different natural language processing models for the sentiment analysis of social media applied to financial market prediction, using the cryptocurrency domain as a reference. We study how the different sentiment metrics are correlated with the price movements of Bitcoin. For this purpose, we explore different methods to calculate the sentiment metrics from a text finding most of them not very accurate for this prediction task. We find that one of the models outperforms more than 20 other public ones and makes it possible to fine-tune it efficiently given its interpretable nature. Thus we confirm that interpretable artificial intelligence and natural language processing methods might be more valuable practically than non-explainable and non-interpretable ones. In the end, we analyse potential causal connections between the different sentiment metrics and the price movements.