Abstract:Consider a directed network where each node is either red (using the red product), blue (using the blue product), or uncolored (undecided). Then in each round, an uncolored node chooses red (resp. blue) with some probability proportional to the number of its red (resp. blue) out-neighbors. What is the best strategy to maximize the expected final number of red nodes given the budget to select $k$ red seed nodes? After proving that this problem is computationally hard, we provide a polynomial time approximation algorithm with the best possible approximation guarantee, building on the monotonicity and submodularity of the objective function and exploiting the Monte Carlo method. Furthermore, our experiments on various real-world and synthetic networks demonstrate that our proposed algorithm outperforms other algorithms. Additionally, we investigate the convergence time of the aforementioned process both theoretically and experimentally. In particular, we prove several tight bounds on the convergence time in terms of different graph parameters, such as the number of nodes/edges, maximum out-degree and diameter, by developing novel proof techniques.
Abstract:Critical nodes in networks are extremely vulnerable to malicious attacks to trigger negative cascading events such as the spread of misinformation and diseases. Therefore, effective moderation of critical nodes is very vital for mitigating the potential damages caused by such malicious diffusions. The current moderation methods are computationally expensive. Furthermore, they disregard the fundamental metric of information centrality, which measures the dissemination power of nodes. We investigate the problem of removing $k$ edges from a network to minimize the information centrality of a target node $\lea$ while preserving the network's connectivity. We prove that this problem is computationally challenging: it is NP-complete and its objective function is not supermodular. However, we propose three approximation greedy algorithms using novel techniques such as random walk-based Schur complement approximation and fast sum estimation. One of our algorithms runs in nearly linear time in the number of edges. To complement our theoretical analysis, we conduct a comprehensive set of experiments on synthetic and real networks with over one million nodes. Across various settings, the experimental results illustrate the effectiveness and efficiency of our proposed algorithms.
Abstract:Maximizing influences in complex networks is a practically important but computationally challenging task for social network analysis, due to its NP- hard nature. Most current approximation or heuristic methods either require tremendous human design efforts or achieve unsatisfying balances between effectiveness and efficiency. Recent machine learning attempts only focus on speed but lack performance enhancement. In this paper, different from previous attempts, we propose an effective deep reinforcement learning model that achieves superior performances over traditional best influence maximization algorithms. Specifically, we design an end-to-end learning framework that combines graph neural network as the encoder and reinforcement learning as the decoder, named DREIM. Trough extensive training on small synthetic graphs, DREIM outperforms the state-of-the-art baseline methods on very large synthetic and real-world networks on solution quality, and we also empirically show its linear scalability with regard to the network size, which demonstrates its superiority in solving this problem.
Abstract:As a fundamental issue in network analysis, structural node similarity has received much attention in academia and is adopted in a wide range of applications. Among these proposed structural node similarity measures, role similarity stands out because of satisfying several axiomatic properties including automorphism conformation. Existing role similarity metrics cannot handle top-k queries on large real-world networks due to the high time and space cost. In this paper, we propose a new role similarity metric, namely \textsf{ForestSim}. We prove that \textsf{ForestSim} is an admissible role similarity metric and devise the corresponding top-k similarity search algorithm, namely \textsf{ForestSimSearch}, which is able to process a top-k query in $O(k)$ time once the precomputation is finished. Moreover, we speed up the precomputation by using a fast approximate algorithm to compute the diagonal entries of the forest matrix, which reduces the time and space complexity of the precomputation to $O(\epsilon^{-2}m\log^5{n}\log{\frac{1}{\epsilon}})$ and $O(m\log^3{n})$, respectively. Finally, we conduct extensive experiments on 26 real-world networks. The results show that \textsf{ForestSim} works efficiently on million-scale networks and achieves comparable performance to the state-of-art methods.