Abstract:This paper discusses the capacity of graph neural networks to learn the functional form of ordinary differential equations that govern dynamics on complex networks. We propose necessary elements for such a problem, namely, inductive biases, a neural network architecture and a learning task. Statistical learning theory suggests that generalisation power of neural networks relies on independence and identical distribution (i.i.d.)\ of training and testing data. Although this assumption together with an appropriate neural architecture and a learning mechanism is sufficient for accurate out-of-sample predictions of dynamics such as, e.g.\ mass-action kinetics, by studying the out-of-distribution generalisation in the case of diffusion dynamics, we find that the neural network model: (i) has a generalisation capacity that depends on the first moment of the initial value data distribution; (ii) learns the non-dissipative nature of dynamics implicitly; and (iii) the model's accuracy resolution limit is of order $\mathcal{O}(1/\sqrt{n})$ for a system of size $n$.
Abstract:Community Question Answering (CQA) websites have become valuable knowledge repositories where individuals exchange information by asking and answering questions. With an ever-increasing number of questions and high migration of users in and out of communities, a key challenge is to design effective strategies for recommending experts for new questions. In this paper, we propose a simple graph-diffusion expert recommendation model for CQA, that can outperform state-of-the art deep learning representatives and collaborative models. Our proposed method learns users' expertise in the context of both semantic and temporal information to capture their changing interest and activity levels with time. Experiments on five real-world datasets from the Stack Exchange network demonstrate that our approach outperforms competitive baseline methods. Further, experiments on cold-start users (users with a limited historical record) show our model achieves an average of ~ 30% performance gain compared to the best baseline method.
Abstract:Question Routing in Community-based Question Answering websites aims at recommending newly posted questions to potential users who are most likely to provide "accepted answers". Most of the existing approaches predict users' expertise based on their past question answering behavior and the content of new questions. However, these approaches suffer from challenges in three aspects: 1) sparsity of users' past records results in lack of personalized recommendation that at times does not match users' interest or domain expertise, 2) modeling based on all questions and answers content makes periodic updates computationally expensive, and 3) while CQA sites are highly dynamic, they are mostly considered as static. This paper proposes a novel approach to QR that addresses the above challenges. It is based on dynamic modeling of users' activity on topic communities. Experimental results on three real-world datasets demonstrate that the proposed model significantly outperforms competitive baseline models
Abstract:Understanding the variations in trading price (volatility), and its response to external information is a well-studied topic in finance. In this study, we focus on volatility predictions for a relatively new asset class of cryptocurrencies (in particular, Bitcoin) using deep learning representations of public social media data from Twitter. For the field work, we extracted semantic information and user interaction statistics from over 30 million Bitcoin-related tweets, in conjunction with 15-minute intraday price data over a 144-day horizon. Using this data, we built several deep learning architectures that utilized a combination of the gathered information. For all architectures, we conducted ablation studies to assess the influence of each component and feature set in our model. We found statistical evidences for the hypotheses that: (i) temporal convolutional networks perform significantly better than both autoregressive and other deep learning-based models in the literature, and (ii) the tweet author meta-information, even detached from the tweet itself, is a better predictor than the semantic content and tweet volume statistics.
Abstract:Although optimal control problems of dynamical systems can be formulated within the framework of variational calculus, their solution for complex systems is often analytically and computationally intractable. In this Letter we present a versatile neural ordinary-differential-equation control (NODEC) framework with implicit energy regularization and use it to obtain neural-network-generated control signals that can steer dynamical systems towards a desired target state within a predefined amount of time. We demonstrate the ability of NODEC to learn control signals that closely resemble those found by corresponding optimal control frameworks in terms of control energy and deviation from the desired target state. Our results suggest that NODEC is capable to solve a wide range of control and optimization problems, including those that are analytically intractable.
Abstract:We quantify the propagation and absorption of large-scale publicly available news articles from the World Wide Web to financial markets. To extract publicly available information, we use the news archives from the Common Crawl, a nonprofit organization that crawls a large part of the web. We develop a processing pipeline to identify news articles associated with the constituent companies in the S\&P 500 index, an equity market index that measures the stock performance of U.S. companies. Using machine learning techniques, we extract sentiment scores from the Common Crawl News data and employ tools from information theory to quantify the information transfer from public news articles to the U.S. stock market. Furthermore, we analyze and quantify the economic significance of the news-based information with a simple sentiment-based portfolio trading strategy. Our findings provides support for that information in publicly available news on the World Wide Web has a statistically and economically significant impact on events in financial markets.
Abstract:We study the ability of neural networks to steer or control trajectories of dynamical systems on graphs. In particular, we introduce a neural-network control (NNC) framework, which represents dynamical systems by neural ordinary different equations (neural ODEs), and find that NNC can learn control signals that drive networked dynamical systems into desired target states. To identify the influence of different target states on the NNC performance, we study two types of control: (i) microscopic control and (ii) macroscopic control. Microscopic control minimizes the L2 norm between the current and target state and macroscopic control minimizes the corresponding Wasserstein distance. We find that the proposed NNC framework produces low-energy control signals that are highly correlated with those of optimal control. Our results are robust for a wide range of graph structures and (non-)linear dynamical systems.
Abstract:For recurrent neural networks trained on time series with target and exogenous variables, in addition to accurate prediction, it is also desired to provide interpretable insights into the data. In this paper, we explore the structure of LSTM recurrent neural networks to learn variable-wise hidden states, with the aim to capture different dynamics in multi-variable time series and distinguish the contribution of variables to the prediction. With these variable-wise hidden states, a mixture attention mechanism is proposed to model the generative process of the target. Then we develop associated training methods to jointly learn network parameters, variable and temporal importance w.r.t the prediction of the target variable. Extensive experiments on real datasets demonstrate enhanced prediction performance by capturing the dynamics of different variables. Meanwhile, we evaluate the interpretation results both qualitatively and quantitatively. It exhibits the prospect as an end-to-end framework for both forecasting and knowledge extraction over multi-variable data.
Abstract:We propose a novel statistical node embedding of directed graphs, which is based on a global minimization of pairwise relative entropy and graph geodesics in a non-linear way. Each node is encoded with a probability density function over a measurable real n-dimensional space. Furthermore, we analyze the connection to the geometrical properties of such embedding and characterize the curvature of the statistical manifolds. Extensive experiments show that our proposed embedding is better preserving the global geodesic information of graphs, as well as outperforming existing embedding models on directed graphs in a variety of evaluation metrics, in an unsupervised setting.
Abstract:The ability to track and monitor relevant and important news in real-time is of crucial interest in multiple industrial sectors. In this work, we focus on the set of cryptocurrency news, which recently became of emerging interest to the general and financial audience. In order to track relevant news in real-time, we (i) match news from the web with tweets from social media, (ii) track their intraday tweet activity and (iii) explore different machine learning models for predicting the number of the article mentions on Twitter within the first 24 hours after its publication. We compare several machine learning models, such as linear extrapolation, linear and random forest autoregressive models, and a sequence-to-sequence neural network. We find that the random forest autoregressive model behaves comparably to more complex models in the majority of tasks.