Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Ernesto Jimenez-Ruiz

Department for Informatics, University of Oslo, The Alan Turing Institute

Survey on Semantic Interpretation of Tabular Data: Challenges and Directions

Nov 07, 2024

Marco Cremaschi, Blerina Spahiu, Matteo Palmonari, Ernesto Jimenez-Ruiz

Abstract:Tabular data plays a pivotal role in various fields, making it a popular format for data manipulation and exchange, particularly on the web. The interpretation, extraction, and processing of tabular information are invaluable for knowledge-intensive applications. Notably, significant efforts have been invested in annotating tabular data with ontologies and entities from background knowledge graphs, a process known as Semantic Table Interpretation (STI). STI automation aids in building knowledge graphs, enriching data, and enhancing web-based question answering. This survey aims to provide a comprehensive overview of the STI landscape. It starts by categorizing approaches using a taxonomy of 31 attributes, allowing for comparisons and evaluations. It also examines available tools, assessing them based on 12 criteria. Furthermore, the survey offers an in-depth analysis of the Gold Standards used for evaluating STI approaches. Finally, it provides practical guidance to help end-users choose the most suitable approach for their specific tasks while also discussing unresolved issues and suggesting potential future research directions.

Via

Access Paper or Ask Questions

What can knowledge graph alignment gain with Neuro-Symbolic learning approaches?

Oct 11, 2023

Pedro Giesteira Cotovio, Ernesto Jimenez-Ruiz, Catia Pesquita

Abstract:Knowledge Graphs (KG) are the backbone of many data-intensive applications since they can represent data coupled with its meaning and context. Aligning KGs across different domains and providers is necessary to afford a fuller and integrated representation. A severe limitation of current KG alignment (KGA) algorithms is that they fail to articulate logical thinking and reasoning with lexical, structural, and semantic data learning. Deep learning models are increasingly popular for KGA inspired by their good performance in other tasks, but they suffer from limitations in explainability, reasoning, and data efficiency. Hybrid neurosymbolic learning models hold the promise of integrating logical and data perspectives to produce high-quality alignments that are explainable and support validation through human-centric approaches. This paper examines the current state of the art in KGA and explores the potential for neurosymbolic integration, highlighting promising research directions for combining these fields.

Via

Access Paper or Ask Questions

Understanding Adverse Biological Effect Predictions Using Knowledge Graphs

Oct 28, 2022

Erik Bryhn Myklebust, Ernesto Jimenez-Ruiz, Jiaoyan Chen, Raoul Wolf, Knut Erik Tollefsen

Abstract:Extrapolation of adverse biological (toxic) effects of chemicals is an important contribution to expand available hazard data in (eco)toxicology without the use of animals in laboratory experiments. In this work, we extrapolate effects based on a knowledge graph (KG) consisting of the most relevant effect data as domain-specific background knowledge. An effect prediction model, with and without background knowledge, was used to predict mean adverse biological effect concentration of chemicals as a prototypical type of stressors. The background knowledge improves the model prediction performance by up to 40\% in terms of $R^2$ (\ie coefficient of determination). We use the KG and KG embeddings to provide quantitative and qualitative insights into the predictions. These insights are expected to improve the confidence in effect prediction. Larger scale implementation of such extrapolation models should be expected to support hazard and risk assessment, by simplifying and reducing testing needs.

* Under review. 29 pages

Via

Access Paper or Ask Questions

Contextual Semantic Embeddings for Ontology Subsumption Prediction

Feb 20, 2022

Jiaoyan Chen, Yuan He, Ernesto Jimenez-Ruiz, Hang Dong, Ian Horrocks

Figure 1 for Contextual Semantic Embeddings for Ontology Subsumption Prediction

Figure 2 for Contextual Semantic Embeddings for Ontology Subsumption Prediction

Figure 3 for Contextual Semantic Embeddings for Ontology Subsumption Prediction

Abstract:Automating ontology curation is a crucial task in knowledge engineering. Prediction by machine learning techniques such as semantic embedding is a promising direction, but the relevant research is still preliminary. In this paper, we present a class subsumption prediction method named BERTSubs, which uses the pre-trained language model BERT to compute contextual embeddings of the class labels and customized input templates to incorporate contexts of surrounding classes. The evaluation on two large-scale real-world ontologies has shown its better performance than the state-of-the-art.

* Short paper (5 pages)

Via

Access Paper or Ask Questions

**OWL2Vec*: Embedding of OWL Ontologies**

Sep 30, 2020

Jiaoyan Chen, Pan Hu, Ernesto Jimenez-Ruiz, Ole Magnus Holter, Denvar Antonyrajah, Ian Horrocks

Figure 1 for OWL2Vec*: Embedding of OWL Ontologies

Figure 2 for OWL2Vec*: Embedding of OWL Ontologies

Figure 3 for OWL2Vec*: Embedding of OWL Ontologies

Figure 4 for OWL2Vec*: Embedding of OWL Ontologies

Abstract:Semantic embedding of knowledge graphs has been widely studied and used for prediction and statistical analysis tasks across various domains such as Natural Language Processing and the Semantic Web. However, less attention has been paid to developing robust methods for embedding OWL (Web Ontology Language) ontologies. In this paper, we propose a language model based ontology embedding method named OWL2Vec*, which encodes the semantics of an ontology by taking into account its graph structure, lexical information and logic constructors. Our empirical evaluation with three real world datasets suggests that OWL2Vec* benefits from these three different aspects of an ontology in class membership prediction and class subsumption prediction tasks. Furthermore, OWL2Vec* often significantly outperforms the state-of-the-art methods in our experiments.

Via

Access Paper or Ask Questions

Correcting Knowledge Base Assertions

Jan 19, 2020

Jiaoyan Chen, Xi Chen, Ian Horrocks, Ernesto Jimenez-Ruiz, Erik B. Myklebus

Figure 1 for Correcting Knowledge Base Assertions

Figure 2 for Correcting Knowledge Base Assertions

Figure 3 for Correcting Knowledge Base Assertions

Figure 4 for Correcting Knowledge Base Assertions

Abstract:The usefulness and usability of knowledge bases (KBs) is often limited by quality issues. One common issue is the presence of erroneous assertions, often caused by lexical or semantic confusion. We study the problem of correcting such assertions, and present a general correction framework which combines lexical matching, semantic embedding, soft constraint mining and semantic consistency checking. The framework is evaluated using DBpedia and an enterprise medical KB.

* Accepted by The Web Conference (WWW) 2020

Via

Access Paper or Ask Questions

Enabling Semantic Data Access for Toxicological Risk Assessment

Sep 11, 2019

Erik Bryhn Myklebust, Ernesto Jimenez-Ruiz, Jiaoyan Chen, Raoul Wolf, Knut Erik Tollefsen

Figure 1 for Enabling Semantic Data Access for Toxicological Risk Assessment

Figure 2 for Enabling Semantic Data Access for Toxicological Risk Assessment

Figure 3 for Enabling Semantic Data Access for Toxicological Risk Assessment

Figure 4 for Enabling Semantic Data Access for Toxicological Risk Assessment

Abstract:Experimental effort and animal welfare are concerns when exploring the effects a compound has on an organism. Appropriate methods for extrapolating chemical effects can further mitigate these challenges. In this paper we present the efforts to (i) (pre)process and gather data from public and private sources, varying from tabular files to SPARQL endpoints, (ii) integrate the data and represent them as a knowledge graph with richer semantics. This knowledge graph is further applied to facilitate the retrieval of the relevant data for a ecological risk assessment task, extrapolation of effect data, where two prediction techniques are developed.

* Submitted to a conference. 14 pages

Via

Access Paper or Ask Questions

Knowledge Graph Embedding for Ecotoxicological Effect Prediction

Jul 02, 2019

Erik B. Myklebust, Ernesto Jimenez-Ruiz, Jiaoyan Chen, Raoul Wolf, Knut Erik Tollefsen

Figure 1 for Knowledge Graph Embedding for Ecotoxicological Effect Prediction

Figure 2 for Knowledge Graph Embedding for Ecotoxicological Effect Prediction

Figure 3 for Knowledge Graph Embedding for Ecotoxicological Effect Prediction

Figure 4 for Knowledge Graph Embedding for Ecotoxicological Effect Prediction

Abstract:Exploring the effects a chemical compound has on a species takes a considerable experimental effort. Appropriate methods for estimating and suggesting new effects can dramatically reduce the work needed to be done by a laboratory. In this paper we explore the suitability of using a knowledge graph embedding approach for ecotoxicological effect prediction. A knowledge graph has been constructed from publicly available data sets, including a species taxonomy and chemical classification and similarity. The publicly available effect data is integrated to the knowledge graph using ontology alignment techniques. Our experimental results show that the knowledge graph based approach improves the selected baselines.

* Preprint of in-use track paper at the International Semantic Web Conference (ISWC) 2019, Auckland, NZ

Via

Access Paper or Ask Questions

Canonicalizing Knowledge Base Literals

Jun 26, 2019

Jiaoyan Chen, Ernesto Jimenez-Ruiz, Ian Horrocks

Figure 1 for Canonicalizing Knowledge Base Literals

Figure 2 for Canonicalizing Knowledge Base Literals

Figure 3 for Canonicalizing Knowledge Base Literals

Figure 4 for Canonicalizing Knowledge Base Literals

Abstract:Ontology-based knowledge bases (KBs) like DBpedia are very valuable resources, but their usefulness and usability is limited by various quality issues. One such issue is the use of string literals instead of semantically typed entities. In this paper we study the automated canonicalization of such literals, i.e., replacing the literal with an existing entity from the KB or with a new entity that is typed using classes from the KB. We propose a framework that combines both reasoning and machine learning in order to predict the relevant entities and types, and we evaluate this framework against state-of-the-art baselines for both semantic typing and entity matching.

* International Semantic Web Conference (ISWC) 2019

Via

Access Paper or Ask Questions

Learning Semantic Annotations for Tabular Data

May 30, 2019

Jiaoyan Chen, Ernesto Jimenez-Ruiz, Ian Horrocks, Charles Sutton

Figure 1 for Learning Semantic Annotations for Tabular Data

Figure 2 for Learning Semantic Annotations for Tabular Data

Figure 3 for Learning Semantic Annotations for Tabular Data

Figure 4 for Learning Semantic Annotations for Tabular Data

Abstract:The usefulness of tabular data such as web tables critically depends on understanding their semantics. This study focuses on column type prediction for tables without any meta data. Unlike traditional lexical matching-based methods, we propose a deep prediction model that can fully exploit a table's contextual semantics, including table locality features learned by a Hybrid Neural Network (HNN), and inter-column semantics features learned by a knowledge base (KB) lookup and query answering algorithm.It exhibits good performance not only on individual table sets, but also when transferring from one table set to another.

* IJCAI 2019
* 7 pages

Via

Access Paper or Ask Questions