Abstract: Protecting users' privacy over the Internet is of great importance. However, as network protocols and components grow more complex, privacy becomes increasingly hard to maintain. Investigating and understanding how data leaks from information transport platforms and protocols can therefore lead to a more secure environment. In this paper, we propose an iterative framework to systematically find the most vulnerable information fields in a network protocol. To this end, focusing on the Transport Layer Security (TLS) protocol, we perform different machine-learning-based fingerprinting attacks on data collected from more than 70 domains (websites) to understand how and where information leakage occurs in TLS. Then, by employing interpretation techniques developed in the machine learning community together with our framework, we identify the most vulnerable information fields in the TLS protocol. Our findings demonstrate that the TLS handshake (which is mainly unencrypted), the TLS record length that appears in the TLS application data header, and the initialization vector (IV) field are, in that order, among the most critical sources of leakage in this protocol.
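To illustrate why the record length is visible to an observer, the following minimal sketch reads the plaintext 5-byte TLS record header (content type, protocol version, payload length) from a byte stream. It is not the feature extraction used in the paper; the function name `parse_tls_records` and the synthetic `sample` bytes are assumptions made for illustration only.

```python
import struct

# Content-type codes defined by the TLS record layer.
CONTENT_TYPES = {
    20: "change_cipher_spec",
    21: "alert",
    22: "handshake",
    23: "application_data",
}

def parse_tls_records(stream: bytes):
    """Return (content_type, version, length) for each complete record.

    Hypothetical helper: it only reads the unencrypted 5-byte header
    and skips over the (possibly encrypted) payload.
    """
    records = []
    offset = 0
    while offset + 5 <= len(stream):
        # 1 byte type, 2 bytes version (major, minor), 2 bytes length, big-endian.
        ctype, major, minor, length = struct.unpack_from("!BBBH", stream, offset)
        if offset + 5 + length > len(stream):
            break  # truncated record; stop rather than guess
        records.append((CONTENT_TYPES.get(ctype, "unknown"), (major, minor), length))
        offset += 5 + length
    return records

# Synthetic stream (assumption): one 4-byte handshake record followed by
# one 32-byte application data record, both with zeroed payloads.
sample = (
    bytes([22, 3, 3, 0, 4]) + b"\x00" * 4
    + bytes([23, 3, 3, 0, 32]) + b"\x00" * 32
)
records = parse_tls_records(sample)
```

Even when the payload is encrypted, the sequence of `(content_type, length)` pairs recovered this way remains observable, which is the kind of side information a fingerprinting classifier can exploit.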
Abstract: Access to complete data in large-scale networks is often infeasible. Missing data is therefore a crucial and unavoidable issue in the analysis and modeling of real-world social networks. However, most research on social networks does not consider this limitation. One effective way to address the problem is to recover the missing data as a pre-processing step. This paper infers the unobserved data in both the diffusion network and the network structure by learning a model from the partially observed data. We develop a probabilistic generative model called "DiffStru" to jointly discover the hidden links of the network structure and the omitted diffusion activities. The proposed method exploits the interrelations between node links and cascade processes by learning coupled low-dimensional latent factors. In addition to inferring the unseen data, the learned latent factors may also help with network classification problems such as community detection. Simulation results on synthetic and real-world datasets show the excellent performance of the proposed method in link prediction and in discovering the identity and infection time of invisible social behaviors.
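The idea of coupling two partially observed views through shared latent factors can be sketched as follows. This is not the DiffStru model, whose probabilistic formulation is not given in the abstract; it is a generic coupled matrix factorization trained by stochastic gradient descent on the observed entries only. All names (`coupled_factorization`, `observed_loss`, `links`, `events`) and the toy data are assumptions for illustration.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def coupled_factorization(links, events, n_nodes, n_cascades,
                          rank=2, lr=0.05, reg=0.01, epochs=500, seed=0):
    """Fit links[i, j] ~ U[i].V[j] and events[i, c] ~ U[i].W[c].

    The node factors U are shared by both views, which is what couples
    the network structure with the cascade observations.
    """
    rng = random.Random(seed)

    def init(n):
        return [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n)]

    U, V, W = init(n_nodes), init(n_nodes), init(n_cascades)

    def sgd_step(a, b, target):
        err = target - dot(a, b)
        for k in range(rank):
            ak, bk = a[k], b[k]
            a[k] += lr * (err * bk - reg * ak)
            b[k] += lr * (err * ak - reg * bk)

    for _ in range(epochs):
        for (i, j), value in links.items():
            sgd_step(U[i], V[j], value)
        for (i, c), value in events.items():
            sgd_step(U[i], W[c], value)
    return U, V, W

def observed_loss(U, V, W, links, events):
    """Squared error over observed entries of both views."""
    loss = sum((v - dot(U[i], V[j])) ** 2 for (i, j), v in links.items())
    loss += sum((v - dot(U[i], W[c])) ** 2 for (i, c), v in events.items())
    return loss

# Toy data (assumption): 4 nodes, 2 cascades; only the listed entries are observed.
links = {(0, 1): 1.0, (1, 0): 1.0, (1, 2): 1.0, (2, 1): 1.0,
         (0, 3): 0.0, (3, 0): 0.0, (2, 3): 1.0}
events = {(0, 0): 1.0, (1, 0): 1.0, (2, 0): 0.0,
          (2, 1): 1.0, (3, 1): 1.0, (0, 1): 0.0}

U0, V0, W0 = coupled_factorization(links, events, 4, 2, epochs=0)
U, V, W = coupled_factorization(links, events, 4, 2)
loss_before = observed_loss(U0, V0, W0, links, events)
loss_after = observed_loss(U, V, W, links, events)

# Score for an unobserved link, e.g. node 3 -> node 2.
score_3_2 = dot(U[3], V[2])
```

Because `U` is updated by both the link and the cascade terms, evidence from one view informs predictions in the other, for example scoring the unobserved pair `(3, 2)` with `dot(U[3], V[2])`.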