Abstract:High-quality and high-coverage rule sets are imperative to the success of Neuro-Symbolic Knowledge Graph Completion (NS-KGC) models, because they form the basis of all symbolic inferences. Recent literature builds neural models for generating rule sets, however, preliminary experiments show that they struggle with maintaining high coverage. In this work, we suggest three simple augmentations to existing rule sets: (1) transforming rules to their abductive forms, (2) generating equivalent rules that use inverse forms of constituent relations and (3) random walks that propose new rules. Finally, we prune potentially low quality rules. Experiments over four datasets and five ruleset-baseline settings suggest that these simple augmentations consistently improve results, and obtain up to 7.1 pt MRR and 8.5 pt Hits@1 gains over using rules without augmentations.
Abstract:We consider two popular approaches to Knowledge Graph Completion (KGC): textual models that rely on textual entity descriptions, and structure-based models that exploit the connectivity structure of the Knowledge Graph (KG). Preliminary experiments show that these approaches have complementary strengths: structure-based models perform well when the gold answer is easily reachable from the query head in the KG, while textual models exploit descriptions to give good performance even when the gold answer is not reachable. In response, we explore ensembling as a way of combining the best of both approaches. We propose a novel method for learning query-dependent ensemble weights by using the distributions of scores assigned by individual models to all candidate entities. Our ensemble baseline achieves state-of-the-art results on three standard KGC datasets, with up to 6.8 pt MRR and 8.3 pt Hits@1 gains over best individual models.
Abstract:While Knowledge Graph Completion (KGC) on static facts is a matured field, Temporal Knowledge Graph Completion (TKGC), that incorporates validity time into static facts is still in its nascent stage. The KGC methods fall into multiple categories including embedding-based, rule-based, GNN-based, pretrained Language Model based approaches. However, such dimensions have not been explored in TKG. To that end, we propose a novel temporal neuro-symbolic model, NeuSTIP, that performs link prediction and time interval prediction in a TKG. NeuSTIP learns temporal rules in the presence of the Allen predicates that ensure the temporal consistency between neighboring predicates in a given rule. We further design a unique scoring function that evaluates the confidence of the candidate answers while performing link prediction and time interval prediction by utilizing the learned rules. Our empirical evaluation on two time interval based TKGC datasets suggests that our model outperforms state-of-the-art models for both link prediction and the time interval prediction task.
Abstract:In this work, we propose a novel knowledge graph alignment technique based upon string edit distance that exploits the type information between entities and can find similarity between relations of any arity
Abstract:We consider the problem of discriminatively learning restricted Boltzmann machines in the presence of relational data. Unlike previous approaches that employ a rule learner (for structure learning) and a weight learner (for parameter learning) sequentially, we develop a gradient-boosted approach that performs both simultaneously. Our approach learns a set of weak relational regression trees, whose paths from root to leaf are conjunctive clauses and represent the structure, and whose leaf values represent the parameters. When the learned relational regression trees are transformed into a lifted RBM, its hidden nodes are precisely the conjunctive clauses derived from the relational regression trees. This leads to a more interpretable and explainable model. Our empirical evaluations clearly demonstrate this aspect, while displaying no loss in effectiveness of the learned models.
Abstract:While deep networks have been enormously successful over the last decade, they rely on flat-feature vector representations, which makes them unsuitable for richly structured domains such as those arising in applications like social network analysis. Such domains rely on relational representations to capture complex relationships between entities and their attributes. Thus, we consider the problem of learning neural networks for relational data. We distinguish ourselves from current approaches that rely on expert hand-coded rules by learning relational random-walk-based features to capture local structural interactions and the resulting network architecture. We further exploit parameter tying of the network weights of the resulting relational neural network, where instances of the same type share parameters. Our experimental results across several standard relational data sets demonstrate the effectiveness of the proposed approach over multiple neural net baselines as well as state-of-the-art statistical relational models.
Abstract:Biogeography is the study of the geographical distribution of biological organisms. The mindset of the engineer is that we can learn from nature. Biogeography Based Optimization is a burgeoning nature inspired technique to find the optimal solution of the problem. Satellite image classification is an important task because it is the only way we can know about the land cover map of inaccessible areas. Though satellite images have been classified in past by using various techniques, the researchers are always finding alternative strategies for satellite image classification so that they may be prepared to select the most appropriate technique for the feature extraction task in hand. This paper is focused on classification of the satellite image of a particular land cover using the theory of Biogeography based Optimization. The original BBO algorithm does not have the inbuilt property of clustering which is required during image classification. Hence modifications have been proposed to the original algorithm and the modified algorithm is used to classify the satellite image of a given region. The results indicate that highly accurate land cover features can be extracted effectively when the proposed algorithm is used.