Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Johannes Stegmüller

Bravo MaRDI: A Wikibase Powered Knowledge Graph on Mathematics

Sep 20, 2023

Moritz Schubotz, Eloi Ferrer, Johannes Stegmüller, Daniel Mietchen, Olaf Teschke, Larissa Pusch, Tim OF Conrad

Figure 1 for Bravo MaRDI: A Wikibase Powered Knowledge Graph on Mathematics

Figure 2 for Bravo MaRDI: A Wikibase Powered Knowledge Graph on Mathematics

Abstract:Mathematical world knowledge is a fundamental component of Wikidata. However, to date, no expertly curated knowledge graph has focused specifically on contemporary mathematics. Addressing this gap, the Mathematical Research Data Initiative (MaRDI) has developed a comprehensive knowledge graph that links multimodal research data in mathematics. This encompasses traditional research data items like datasets, software, and publications and includes semantically advanced objects such as mathematical formulas and hypotheses. This paper details the abilities of the MaRDI knowledge graph, which is based on Wikibase, leading up to its inaugural public release, codenamed Bravo, available on https://portal.mardi4nfdi.de.

* Accepted at Wikidata'23: Wikidata workshop at ISWC 2023

Via

Access Paper or Ask Questions

Detecting Cross-Language Plagiarism using Open Knowledge Graphs

Nov 18, 2021

Johannes Stegmüller, Fabian Bauer-Marquart, Norman Meuschke, Terry Ruas, Moritz Schubotz, Bela Gipp

Figure 1 for Detecting Cross-Language Plagiarism using Open Knowledge Graphs

Figure 2 for Detecting Cross-Language Plagiarism using Open Knowledge Graphs

Figure 3 for Detecting Cross-Language Plagiarism using Open Knowledge Graphs

Figure 4 for Detecting Cross-Language Plagiarism using Open Knowledge Graphs

Abstract:Identifying cross-language plagiarism is challenging, especially for distant language pairs and sense-for-sense translations. We introduce the new multilingual retrieval model Cross-Language Ontology-Based Similarity Analysis (CL\nobreakdash-OSA) for this task. CL-OSA represents documents as entity vectors obtained from the open knowledge graph Wikidata. Opposed to other methods, CL\nobreakdash-OSA does not require computationally expensive machine translation, nor pre-training using comparable or parallel corpora. It reliably disambiguates homonyms and scales to allow its application to Web-scale document collections. We show that CL-OSA outperforms state-of-the-art methods for retrieving candidate documents from five large, topically diverse test corpora that include distant language pairs like Japanese-English. For identifying cross-language plagiarism at the character level, CL-OSA primarily improves the detection of sense-for-sense translations. For these challenging cases, CL-OSA's performance in terms of the well-established PlagDet score exceeds that of the best competitor by more than factor two. The code and data of our study are openly available.

* 10 pages, EEKE21, Preprint

Via

Access Paper or Ask Questions