Abstract: Adopting shared data resources requires scientists to place trust in the originators of the data. When shared data is later used in the development of artificial intelligence (AI) systems or machine learning (ML) models, the trust lineage extends to the users of the system, typically practitioners in fields such as healthcare and finance. Practitioners rely on AI developers to have used relevant, trustworthy data, but may have limited insight and recourse. This paper introduces a software architecture and implementation of a system based on design patterns from the field of self-sovereign identity. Scientists can issue signed credentials attesting to qualities of their data resources. Data contributions to ML models are recorded in a bill of materials (BOM), which is stored with the model as a verifiable credential. The BOM provides a traceable record of the supply chain for an AI system, which facilitates ongoing scrutiny of the qualities of the contributing components. The verified BOM, and its linkage to certified data qualities, is used in the AI Scrutineer, a web-based tool designed to offer practitioners insight into ML model constituents and to highlight problems with adopted datasets should they later be found to contain biased data or otherwise be discredited.
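To make the credential-and-BOM design concrete, here is a minimal sketch in Python, assuming the W3C Verifiable Credentials data model and Ed25519 signatures via the `cryptography` library. The field names, DIDs, and dataset identifiers are illustrative placeholders, not the paper's actual schema.

```python
import json
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Hypothetical issuer key pair; a real SSI system would resolve the
# public key from the issuer's decentralized identifier (DID) document.
issuer_key = Ed25519PrivateKey.generate()

# A minimal dataset-quality credential, loosely following the W3C
# Verifiable Credentials data model (field names are illustrative).
credential = {
    "@context": ["https://www.w3.org/2018/credentials/v1"],
    "type": ["VerifiableCredential", "DatasetQualityCredential"],
    "issuer": "did:example:scientist-123",
    "credentialSubject": {
        "id": "urn:dataset:chest-xray-v2",
        "quality": {"deidentified": True, "labelAudit": "2023-01"},
    },
}

payload = json.dumps(credential, sort_keys=True).encode()
signature = issuer_key.sign(payload)

# A model BOM listing the datasets (and their credential signatures)
# that contributed to the model, itself storable as a credential.
bom = {
    "type": ["VerifiableCredential", "ModelBillOfMaterials"],
    "credentialSubject": {
        "id": "urn:model:diagnosis-net-1.0",
        "components": [
            {"dataset": "urn:dataset:chest-xray-v2",
             "credentialSignature": signature.hex()},
        ],
    },
}

# A verifier (e.g., the AI Scrutineer) re-checks each component's
# signature; verify() raises InvalidSignature on any tampering.
issuer_key.public_key().verify(signature, payload)
```

In a full self-sovereign identity deployment, verification keys would be resolved from DID documents rather than held locally, so datasets can be re-scrutinized long after the model ships.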
Abstract: As the public Ethereum network surpasses half a billion transactions and enterprise blockchain systems become capable of meeting the demands of global deployments, production blockchain applications are fast becoming commonplace across a diverse range of business and scientific verticals. In this paper, we reflect on our recent work on the ingestion, retrieval and analysis of blockchain data. We describe the scaling and semantic challenges that arise when extracting blockchain data in a way that preserves the original metadata of each transaction, by cross-referencing the smart contract interface with the on-chain data. We then present a use case in the area of scientific workflows, describing how we can harvest data from tasks and dependencies in a generic way. Finally, we discuss how crawled public blockchain data can be analyzed using two unsupervised machine learning algorithms designed to identify outlier accounts or smart contracts in the system. We compare and contrast the two methods and cross-correlate their results with public websites to illustrate the effectiveness of such approaches.
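The metadata-preserving extraction step can be illustrated with web3.py: decoding a transaction's raw input against a contract's interface (ABI) recovers the original function name and arguments. The sketch below is an assumption-laden example, not the paper's pipeline; it uses a minimal ERC-20 `transfer` ABI fragment and hard-coded calldata so it runs without a node.

```python
from web3 import Web3

# Minimal ABI fragment for the ERC-20 transfer function; a real ingestion
# pipeline would load the full ABI published for each contract of interest.
ERC20_TRANSFER_ABI = [{
    "name": "transfer",
    "type": "function",
    "inputs": [
        {"name": "_to", "type": "address"},
        {"name": "_value", "type": "uint256"},
    ],
    "outputs": [{"name": "", "type": "bool"}],
}]

w3 = Web3()  # no node connection is needed for offline decoding
contract = w3.eth.contract(abi=ERC20_TRANSFER_ABI)

# Hard-coded calldata standing in for tx["input"], which in practice would
# come from w3.eth.get_transaction(tx_hash): transfer(0x1111..., 1000).
tx_input = (
    "0xa9059cbb"  # 4-byte selector for transfer(address,uint256)
    "0000000000000000000000001111111111111111111111111111111111111111"
    "00000000000000000000000000000000000000000000000000000000000003e8"
)

func, params = contract.decode_function_input(tx_input)
print(func.fn_name, params)  # transfer {'_to': '0x1111...', '_value': 1000}
```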
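The abstract does not name the two unsupervised algorithms. As an illustrative stand-in only, the following sketch scores per-account features with an isolation forest and with distance to the nearest k-means centroid (both via scikit-learn), then flags accounts that both methods rank as extreme; the synthetic features and 1% thresholds are assumptions.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic per-account features standing in for values extracted from
# crawled transactions, e.g. transaction count, total value moved,
# and number of unique counterparties.
rng = np.random.default_rng(0)
features = rng.lognormal(size=(10_000, 3))

X = StandardScaler().fit_transform(np.log1p(features))

# Method 1: isolation forest anomaly scores (lower = more anomalous).
iso_scores = IsolationForest(random_state=0).fit(X).score_samples(X)

# Method 2: distance to the nearest k-means centroid as an outlier score
# (higher = further from any cluster of typical behavior).
km = KMeans(n_clusters=8, n_init=10, random_state=0).fit(X)
km_scores = np.min(km.transform(X), axis=1)

# Flag accounts that both methods place in their most extreme 1%,
# mirroring the compare-and-contrast of the two methods' outputs.
iso_flags = iso_scores < np.quantile(iso_scores, 0.01)
km_flags = km_scores > np.quantile(km_scores, 0.99)
candidates = np.flatnonzero(iso_flags & km_flags)
print(f"{len(candidates)} accounts flagged by both methods")
```

Accounts flagged by both methods would then be the ones worth cross-correlating against public websites, as the abstract describes.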