Abstract:Graph embedding methods embed the nodes in a graph in low dimensional vector space while preserving graph topology to carry out the downstream tasks such as link prediction, node recommendation and clustering. These tasks depend on a similarity measure such as cosine similarity and Euclidean distance between a pair of embeddings that are symmetric in nature and hence do not hold good for directed graphs. Recent work on directed graphs, HOPE, APP, and NERD, proposed to preserve the direction of edges among nodes by learning two embeddings, source and target, for every node. However, these methods do not take into account the properties of directed edges explicitly. To understand the directional relation among nodes, we propose a novel approach that takes advantage of the non commutative property of vector cross product to learn embeddings that inherently preserve the direction of edges among nodes. We learn the node embeddings through a Siamese neural network where the cross-product operation is incorporated into the network architecture. Although cross product between a pair of vectors is defined in three dimensional, the approach is extended to learn N dimensional embeddings while maintaining the non-commutative property. In our empirical experiments on three real-world datasets, we observed that even very low dimensional embeddings could effectively preserve the directional property while outperforming some of the state-of-the-art methods on link prediction and node recommendation tasks
Abstract:Machine learning models are extensively being used to make decisions that have a significant impact on human life. These models are trained over historical data that may contain information about sensitive attributes such as race, sex, religion, etc. The presence of such sensitive attributes can impact certain population subgroups unfairly. It is straightforward to remove sensitive features from the data; however, a model could pick up prejudice from latent sensitive attributes that may exist in the training data. This has led to the growing apprehension about the fairness of the employed models. In this paper, we propose a novel algorithm that can effectively identify and treat latent discriminating features. The approach is agnostic of the learning algorithm and generalizes well for classification as well as regression tasks. It can also be used as a key aid in proving that the model is free of discrimination towards regulatory compliance if the need arises. The approach helps to collect discrimination-free features that would improve the model performance while ensuring the fairness of the model. The experimental results from our evaluations on publicly available real-world datasets show a near-ideal fairness measurement in comparison to other methods.