Abstract:Models play a critical role in managing the vast amounts of data and increasing complexity found in the IoT, IIoT, and IoP domains. The Digital Shadow Reference Model, which serves as a foundational metadata schema for linking data and metadata in these environments, is an example of such a model. Ensuring FAIRness (adherence to the FAIR Principles) is critical because it improves data findability, accessibility, interoperability, and reusability, facilitating efficient data management and integration across systems. This paper presents an evaluation of the FAIRness of the Digital Shadow Reference Model using a structured evaluation framework based on the FAIR Data Principles. Using the concept of FAIR Implementation Profiles (FIPs), supplemented by a mini-questionnaire, we systematically evaluate the model's adherence to these principles. Our analysis identifies key strengths, including the model's metadata schema that supports rich descriptions and authentication techniques, and highlights areas for improvement, such as the need for globally unique identifiers and consequent support for different Web standards. The results provide actionable insights for improving the FAIRness of the model and promoting better data management and reuse. This research contributes to the field by providing a detailed assessment of the Digital Shadow Reference Model and recommending next steps to improve its FAIRness and usability.
Abstract:Dataspaces have recently gained adoption across various sectors, including traditionally less digitized domains such as culture. Leveraging Semantic Web technologies helps to make dataspaces FAIR, but their complexity poses a significant challenge to the adoption of dataspaces and increases their cost. The advent of Large Language Models (LLMs) raises the question of how these models can support the adoption of FAIR dataspaces. In this work, we demonstrate the potential of LLMs in dataspaces with a concrete example. We also derive a research agenda for exploring this emerging field.
Abstract:In recent years, data lakes emerged as away to manage large amounts of heterogeneous data for modern data analytics. One way to prevent data lakes from turning into inoperable data swamps is semantic data management. Some approaches propose the linkage of metadata to knowledge graphs based on the Linked Data principles to provide more meaning and semantics to the data in the lake. Such a semantic layer may be utilized not only for data management but also to tackle the problem of data integration from heterogeneous sources, in order to make data access more expressive and interoperable. In this survey, we review recent approaches with a specific focus on the application within data lake systems and scalability to Big Data. We classify the approaches into (i) basic semantic data management, (ii) semantic modeling approaches for enriching metadata in data lakes, and (iii) methods for ontologybased data access. In each category, we cover the main techniques and their background, and compare latest research. Finally, we point out challenges for future work in this research area, which needs a closer integration of Big Data and Semantic Web technologies.