Title: CLARITY -- Comparing heterogeneous data using dissimiLARITY

May 29, 2020

Daniel J. Lawson, Vinesh Solanki, Igor Yanovich, Johannes Dellert, Damian Ruck, Phillip Endicott

Abstract: Integrating datasets from different disciplines is hard because the data are often qualitatively different in meaning, scale, and reliability. When two datasets describe the same entities, many scientific questions can be phrased around whether the similarities between entities are conserved. Our method, CLARITY, quantifies consistency across datasets, identifies where inconsistencies arise, and aids in their interpretation. We explore three diverse comparisons: gene methylation vs gene expression, evolution of language sounds vs word use, and country-level economic metrics vs cultural beliefs. The non-parametric approach is robust to noise and differences in scaling, and makes only weak assumptions about how the data were generated. It operates by decomposing similarities into two components: the 'structural' component, analogous to a clustering, and an underlying 'relationship' between those structures. This allows a 'structural comparison' between two similarity matrices using their predictability from 'structure'. The software, CLARITY, is available as an R package from https://github.com/danjlawson/CLARITY.

* R package available from https://github.com/danjlawson/CLARITY . 23 pages, 6 Figures
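
The idea of a 'structural comparison' described in the abstract can be illustrated with a minimal sketch. This is not the CLARITY implementation and does not use the R package's API. It assumes the simplest possible 'structure': a hard clustering of the entities, taken as already learned from a first dataset. Under that assumption, the least-squares 'relationship' between clusters is the mean similarity of each block, and the residuals show which entities in a second dataset are poorly predicted by the first dataset's structure. All names (`fit_relationship`, `structural_residuals`, `per_entity_error`) and the toy matrices are hypothetical.

```python
from collections import defaultdict


def fit_relationship(Y, labels):
    """Least-squares 'relationship' between hard clusters: the block means of Y."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    n = len(labels)
    for i in range(n):
        for j in range(n):
            block = (labels[i], labels[j])
            sums[block] += Y[i][j]
            counts[block] += 1
    return {block: sums[block] / counts[block] for block in sums}


def structural_residuals(Y, labels):
    """Residuals of Y after predicting each entry from its cluster-pair mean."""
    X = fit_relationship(Y, labels)
    n = len(labels)
    return [[Y[i][j] - X[(labels[i], labels[j])] for j in range(n)]
            for i in range(n)]


def per_entity_error(residuals):
    """Sum of squared residuals per entity (one value per row)."""
    return [sum(r * r for r in row) for row in residuals]


# Hypothetical toy data: six entities, with clusters assumed learned from dataset 1.
labels = [0, 0, 0, 1, 1, 1]
n = len(labels)

# A second dataset whose similarities follow the same structure exactly.
Y_consistent = [[1.0 if labels[i] == labels[j] else 0.0 for j in range(n)]
                for i in range(n)]

# A second dataset in which entity 5 behaves like a member of cluster 0.
moved = [0, 0, 0, 1, 1, 0]
Y_shifted = [[1.0 if moved[i] == moved[j] else 0.0 for j in range(n)]
             for i in range(n)]

errors_consistent = per_entity_error(structural_residuals(Y_consistent, labels))
errors_shifted = per_entity_error(structural_residuals(Y_shifted, labels))

# The entity whose similarities are least predictable from the assumed structure.
least_predictable = max(range(n), key=errors_shifted.__getitem__)
print(errors_consistent)
print(errors_shifted)
print(least_predictable)
```

In this toy setting the consistent dataset leaves no residual, while the shifted dataset concentrates its error on the entity that changed clusters. Because the relationship is refitted for each dataset, rescaling the second matrix rescales the residuals but does not change which entity stands out.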