Olaf Teschke

An Overview of zbMATH Open Digital Library

Oct 09, 2024

Madhurima Deb, Isabel Beckenbach, Matteo Petrera, Dariush Ehsani, Marcel Fuhrmann, Yun Hao, Olaf Teschke, Moritz Schubotz

Abstract: Mathematical research thrives on the effective dissemination and discovery of knowledge. zbMATH Open has emerged as a pivotal platform in this landscape, offering a comprehensive repository of mathematical literature. Beyond indexing and abstracting, it serves as a unified, quality-assured infrastructure for finding, evaluating, and connecting mathematical information that advances mathematical research as well as interdisciplinary exploration. zbMATH Open enables scientific quality control through post-publication reviews and promotes connections between researchers, institutions, and research outputs. This paper presents the most significant features of this open-access service, highlighting its role in shaping the future of mathematical information retrieval.

Reducing the climate impact of data portals: a case study

Jun 06, 2024

Noah Gießing, Madhurima Deb, Ankit Satpute, Moritz Schubotz, Olaf Teschke

Abstract: The carbon footprint share of the information and communication technology (ICT) sector has steadily increased in the past decade and is predicted to make up as much as 23% of global emissions in 2030. This shows a pressing need for developers, including the information retrieval community, to make their code more energy-efficient. In this project proposal, we discuss techniques to reduce the energy footprint of the MaRDI (Mathematical Research Data Initiative) Portal, a MediaWiki-based knowledge base. In future work, we plan to implement these changes and provide concrete measurements of the gain in energy efficiency. Researchers developing similar knowledge bases can adapt our measures to reduce their environmental footprint. In this way, we are working on mitigating the climate impact of information retrieval research.

* 4 pages

Can LLMs Master Math? Investigating Large Language Models on Math Stack Exchange

Mar 30, 2024

Ankit Satpute, Noah Giessing, Andre Greiner-Petter, Moritz Schubotz, Olaf Teschke, Akiko Aizawa, Bela Gipp

Abstract: Large Language Models (LLMs) have demonstrated exceptional capabilities in various natural language tasks, often achieving performance that surpasses that of humans. Despite these advancements, the domain of mathematics presents a distinctive challenge, primarily due to its specialized structure and the precision it demands. In this study, we adopt a two-step approach to investigate the proficiency of LLMs in answering mathematical questions. First, we employ the most effective LLMs, as identified by their performance on math question-answer benchmarks, to generate answers to 78 questions from the Math Stack Exchange (MSE). Second, we conduct a case analysis of the best-performing LLM, focusing on the quality and accuracy of its answers through manual evaluation. We found that GPT-4 performs best (nDCG of 0.48 and P@10 of 0.37) amongst existing LLMs fine-tuned for answering mathematics questions and outperforms the current best approach on ArqMATH3 Task1, considering P@10. Our case analysis indicates that while GPT-4 can generate relevant responses in certain instances, it does not consistently answer all questions accurately. This paper explores the current limitations of LLMs in navigating complex mathematical problem-solving. Through case analysis, we shed light on the gaps in LLM capabilities within mathematics, thereby setting the stage for future research and advancements in AI-driven mathematical reasoning. We make our code and findings publicly available for research: https://github.com/gipplab/LLM-Investig-MathStackExchange

* Accepted for publication at the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), July 14-18, 2024, Washington, D.C., USA
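
The abstract above reports results as nDCG and P@10. As a rough illustration of what these two ranking metrics measure, here is a minimal sketch. It is not the evaluation code from the paper or from the ARQMath lab: the function names, the linear gain, the log2 discount, and the `threshold` used to decide when a graded answer counts as relevant are all assumptions made for this example.

```python
import math
from typing import Optional, Sequence


def precision_at_k(relevance: Sequence[float], k: int = 10, threshold: float = 1) -> float:
    """Share of the top-k ranked answers whose grade reaches `threshold`.

    `relevance` holds one graded judgment per ranked answer, best rank first.
    The binarization threshold is an assumption of this sketch.
    """
    top = relevance[:k]
    return sum(1 for grade in top if grade >= threshold) / k


def dcg(relevance: Sequence[float]) -> float:
    """Discounted cumulative gain with linear gain and a log2 rank discount."""
    return sum(grade / math.log2(rank + 2) for rank, grade in enumerate(relevance))


def ndcg(relevance: Sequence[float], k: Optional[int] = None) -> float:
    """DCG of the ranking divided by the DCG of the ideal reordering."""
    ranked = list(relevance) if k is None else list(relevance[:k])
    ideal = sorted(relevance, reverse=True)[: len(ranked)]
    ideal_dcg = dcg(ideal)
    return dcg(ranked) / ideal_dcg if ideal_dcg > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical graded judgments (0 = not relevant, 3 = highly relevant)
    # for the ten answers returned for a single question.
    grades = [3, 0, 2, 0, 1, 0, 0, 0, 0, 0]
    print(precision_at_k(grades, k=10))
    print(ndcg(grades, k=10))
```

Both functions score a single question; a system-level figure such as those quoted in the abstract would typically be the mean over all evaluated questions.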

Taxonomy of Mathematical Plagiarism

Jan 30, 2024

Ankit Satpute, Andre Greiner-Petter, Noah Gießing, Isabel Beckenbach, Moritz Schubotz, Olaf Teschke, Akiko Aizawa, Bela Gipp

Abstract: Plagiarism is a pressing concern, even more so with the availability of large language models. Existing plagiarism detection systems reliably find copied and moderately reworded text but fail for idea plagiarism, especially in the mathematical sciences, which rely heavily on formal mathematical notation. We make two contributions. First, we establish a taxonomy of mathematical content reuse by annotating 122 potentially plagiarised scientific document pairs. Second, we analyze the best-performing approaches to detect plagiarism and mathematical content similarity on the newly established taxonomy. We found that the best-performing methods for plagiarism and math content similarity achieve an overall detection score (PlagDet) of 0.06 and 0.16, respectively. The best-performing methods failed to detect most cases from all seven newly established math similarity types. The outlined contributions will benefit research in plagiarism detection systems, recommender systems, question-answering systems, and search engines. We make our experiment's code and annotated dataset available to the community: https://github.com/gipplab/Taxonomy-of-Mathematical-Plagiarism

* 46th European Conference on Information Retrieval (ECIR)

Bravo MaRDI: A Wikibase Powered Knowledge Graph on Mathematics

Sep 20, 2023

Moritz Schubotz, Eloi Ferrer, Johannes Stegmüller, Daniel Mietchen, Olaf Teschke, Larissa Pusch, Tim OF Conrad

Abstract: Mathematical world knowledge is a fundamental component of Wikidata. However, to date, no expertly curated knowledge graph has focused specifically on contemporary mathematics. Addressing this gap, the Mathematical Research Data Initiative (MaRDI) has developed a comprehensive knowledge graph that links multimodal research data in mathematics. This encompasses traditional research data items such as datasets, software, and publications, as well as semantically advanced objects such as mathematical formulas and hypotheses. This paper details the capabilities of the MaRDI knowledge graph, which is based on Wikibase, leading up to its inaugural public release, codenamed Bravo, available at https://portal.mardi4nfdi.de.

* Accepted at Wikidata'23: Wikidata workshop at ISWC 2023

AutoMSC: Automatic Assignment of Mathematics Subject Classification Labels

May 25, 2020

Moritz Schubotz, Philipp Scharpf, Olaf Teschke, Andreas Kühnemund, Corinna Breitinger, Bela Gipp

Abstract: Authors of research papers in mathematics and other math-heavy disciplines commonly employ the Mathematics Subject Classification (MSC) scheme to search for relevant literature. The MSC is a hierarchical alphanumerical classification scheme that allows librarians to specify one or multiple codes for publications. Digital libraries in mathematics, as well as reviewing services such as zbMATH and Mathematical Reviews (MR), rely on these MSC labels in their workflows to organize the abstracting and reviewing process. In particular, the coarse-grained classification determines the subject editor who is responsible for the actual reviewing process. In this paper, we investigate the feasibility of automatically assigning a coarse-grained primary classification using the MSC scheme by regarding the problem as a multi-class classification machine learning task. We find that our method achieves an F1-score of over 77%, which is remarkably close to the agreement of zbMATH and MR (F1-score of 81%). Moreover, we find that the method's confidence score allows for reducing the effort by 86% compared to the manual coarse-grained classification effort while maintaining a precision of 81% for automatically classified articles.
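
The trade-off described at the end of this abstract, where a confidence score decides which articles are classified automatically and which still go to a human, can be sketched as a threshold search. This is an illustration under stated assumptions, not the AutoMSC implementation: the function name `automation_tradeoff`, the tuple layout, and the sample data are hypothetical.

```python
from typing import List, Tuple

# One record per article: (predicted MSC class, gold MSC class, classifier confidence).
Prediction = Tuple[str, str, float]


def automation_tradeoff(
    preds: List[Prediction], target_precision: float
) -> Tuple[float, float, float]:
    """Find the lowest confidence threshold that still meets a precision target.

    Articles with confidence >= threshold are accepted automatically; the rest
    stay with human editors. Returns (threshold, precision on accepted
    articles, share of all articles accepted). If no threshold qualifies,
    nothing is accepted and the threshold is infinity.
    """
    ranked = sorted(preds, key=lambda p: p[2], reverse=True)
    best = (float("inf"), 1.0, 0.0)
    correct = 0
    for n, (predicted, gold, confidence) in enumerate(ranked, start=1):
        correct += predicted == gold
        # Articles with equal confidence are accepted or rejected together,
        # so only evaluate a cut after the last member of a tie group.
        if n < len(ranked) and ranked[n][2] == confidence:
            continue
        precision = correct / n
        if precision >= target_precision:
            best = (confidence, precision, n / len(ranked))
    return best


if __name__ == "__main__":
    # Hypothetical predictions for six articles (two-digit MSC classes).
    sample = [
        ("05", "05", 0.95),
        ("11", "11", 0.90),
        ("35", "35", 0.80),
        ("60", "62", 0.70),
        ("68", "68", 0.60),
        ("14", "13", 0.40),
    ]
    threshold, precision, share = automation_tradeoff(sample, target_precision=0.8)
    print(threshold, precision, share)
```

The share of automatically accepted articles plays the role of the effort reduction mentioned in the abstract, and the precision on that accepted subset is the quantity held at the target.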
