Title: Another Use of SMOTE for Interpretable Data Collaboration Analysis

Aug 26, 2022

Akira Imakura, Masateru Kihira, Yukihiko Okada, Tetsuya Sakurai

Abstract: Recently, data collaboration (DC) analysis has been developed for privacy-preserving integrated analysis across multiple institutions. DC analysis centralizes individually constructed, dimensionality-reduced intermediate representations and realizes integrated analysis via collaboration representations, without sharing the original data. To construct the collaboration representations, each institution generates and shares a shareable anchor dataset and centralizes its intermediate representation. Although a random anchor dataset generally works well for DC analysis, using an anchor dataset whose distribution is close to that of the raw dataset is expected to improve recognition performance, particularly for interpretable DC analysis. Based on an extension of the synthetic minority over-sampling technique (SMOTE), this study proposes an anchor data construction technique that improves recognition performance without increasing the risk of data leakage. Numerical results demonstrate the efficiency of the proposed SMOTE-based method over existing anchor data constructions on artificial and real-world datasets. Specifically, on an income dataset, the proposed method improves accuracy by 9 percentage points and essential feature selection by 38 percentage points over existing methods. The proposed method thus provides another use of SMOTE: not for imbalanced data classification, but as a key technology for privacy-preserving integrated analysis.
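
To make the idea of a SMOTE-style anchor dataset concrete, the following is a minimal sketch of plain SMOTE interpolation applied to a whole raw dataset rather than to a minority class. It is not the paper's extended method, whose details the abstract does not give, and it makes no claim about data-leakage risk. The function name `smote_style_anchor`, its parameters, and the toy data are assumptions made for illustration only.

```python
import numpy as np


def smote_style_anchor(X, n_anchor, k=5, seed=0):
    """Hypothetical helper: build anchor points by SMOTE-style interpolation.

    Each anchor point lies on the line segment between a randomly chosen
    raw sample and one of its k nearest neighbours, so the anchor set
    roughly follows the distribution of the raw data.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if n < 2:
        raise ValueError("need at least two samples to interpolate")
    k = min(k, n - 1)

    # Pairwise squared Euclidean distances; the diagonal is masked so that
    # a sample is never selected as its own neighbour.
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)
    neighbours = np.argsort(d2, axis=1)[:, :k]  # shape (n, k)

    base = rng.integers(0, n, size=n_anchor)    # base sample per anchor point
    pick = rng.integers(0, k, size=n_anchor)    # which neighbour to use
    partner = neighbours[base, pick]
    gap = rng.random((n_anchor, 1))             # interpolation weight in [0, 1)

    return X[base] + gap * (X[partner] - X[base])


# Toy usage with made-up data: 50 raw samples, 4 features, 20 anchor points.
X_raw = np.random.default_rng(1).normal(size=(50, 4))
anchor = smote_style_anchor(X_raw, n_anchor=20, k=5, seed=0)
```

Because every anchor point is a convex combination of two raw samples, the anchor set stays inside the range of the raw data. A uniformly random anchor set, by contrast, ignores the structure of the raw data.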

* 19 pages, 3 figures, 7 tables
