Victor Minden

Synthetic DOmain-Targeted Augmentation (S-DOTA) Improves Model Generalization in Digital Pathology

May 03, 2023

Sai Chowdary Gullapally, Yibo Zhang, Nitin Kumar Mittal, Deeksha Kartik, Sandhya Srinivasan, Kevin Rose, Daniel Shenker, Dinkar Juyal, Harshith Padigela, Raymond Biju (+13 more)

Abstract: Machine learning algorithms have the potential to improve patient outcomes in digital pathology. However, generalization of these tools is currently limited by sensitivity to variations in tissue preparation, staining procedures and scanning equipment that lead to domain shift in digitized slides. To overcome this limitation and improve model generalization, we studied the effectiveness of two Synthetic DOmain-Targeted Augmentation (S-DOTA) methods, namely CycleGAN-enabled Scanner Transform (ST) and targeted Stain Vector Augmentation (SVA), and compared them against the International Color Consortium (ICC) profile-based color calibration (ICC Cal) method and a baseline method using traditional brightness, color and noise augmentations. We evaluated the ability of these techniques to improve model generalization to various tasks and settings: four models, two model types (tissue segmentation and cell classification), two loss functions, six labs, six scanners, and three indications (hepatocellular carcinoma (HCC), nonalcoholic steatohepatitis (NASH), prostate adenocarcinoma). We compared these methods based on the macro-averaged F1 scores on in-distribution (ID) and out-of-distribution (OOD) test sets across multiple domains, and found that S-DOTA methods (i.e., ST and SVA) led to significant improvements over ICC Cal and baseline on OOD data while maintaining comparable performance on ID data. Thus, we demonstrate that S-DOTA may help address generalization due to domain shift in real world applications.
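
The comparison metric named in the abstract above, the macro-averaged F1 score, is the unweighted mean of the per-class F1 scores, so rare classes count as much as common ones. A minimal pure-Python sketch follows; the function name `macro_f1` and the toy labels are illustrative assumptions, not taken from the paper.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels, assumed for illustration only.
y_true = ["tumor", "tumor", "stroma", "stroma"]
y_pred = ["tumor", "stroma", "stroma", "stroma"]
score = macro_f1(y_true, y_pred)
```

Computing this score separately on ID and OOD test sets gives the kind of comparison the abstract describes.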

Efficient Principal Subspace Projection of Streaming Data Through Fast Similarity Matching

Aug 06, 2018

Andrea Giovannucci, Victor Minden, Cengiz Pehlevan, Dmitri B. Chklovskii

Abstract: Big data problems frequently require processing datasets in a streaming fashion, either because all data are available at once but collectively are larger than available memory or because the data intrinsically arrive one data point at a time and must be processed online. Here, we introduce a computationally efficient version of similarity matching, a framework for online dimensionality reduction that incrementally estimates the top K-dimensional principal subspace of streamed data while keeping in memory only the last sample and the current iterate. To assess the performance of our approach, we construct and make public a test suite containing both a synthetic data generator and the infrastructure to test online dimensionality reduction algorithms on real datasets, as well as performant implementations of our algorithm and competing algorithms with similar aims. Among the algorithms considered we find our approach to be competitive, performing among the best on both synthetic and real data.

* 9 pages, 4 figures
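
To illustrate the streaming setting described in the abstract above, here is a minimal sketch of Oja's rule, a classical online method that estimates the top principal direction while holding only the current sample and the current iterate in memory. This is a swapped-in textbook technique for the K = 1 case, not the authors' similarity matching algorithm; the function names, learning rate, and synthetic data generator are assumptions made for illustration.

```python
import math
import random


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def oja_top_component(stream, dim, lr=0.01, seed=0):
    """Estimate the top principal direction of a zero-mean stream (Oja's rule)."""
    rng = random.Random(seed)
    w = normalize([rng.gauss(0.0, 1.0) for _ in range(dim)])
    for x in stream:
        y = sum(wi * xi for wi, xi in zip(w, x))  # projection of the sample onto w
        # Hebbian step toward the sample, then renormalize to keep w a unit vector.
        w = normalize([wi + lr * y * xi for wi, xi in zip(w, x)])
    return w


def synthetic_stream(n, seed=1):
    """Hypothetical generator: zero-mean samples with most variance on axis 0."""
    rng = random.Random(seed)
    for _ in range(n):
        yield [3.0 * rng.gauss(0, 1), 0.3 * rng.gauss(0, 1), 0.3 * rng.gauss(0, 1)]


w = oja_top_component(synthetic_stream(5000), dim=3)
```

Because samples are consumed one at a time from a generator, memory use does not grow with the length of the stream, which is the property the abstract emphasizes.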

Sparse canonical correlation analysis

Jun 02, 2017

Xiaotong Suo, Victor Minden, Bradley Nelson, Robert Tibshirani, Michael Saunders

Abstract: Canonical correlation analysis was proposed by Hotelling [6], and it measures the linear relationship between two multidimensional variables. In the high-dimensional setting, classical canonical correlation analysis breaks down. We propose a sparse canonical correlation analysis by adding l1 constraints on the canonical vectors and show how to solve it efficiently using the linearized alternating direction method of multipliers (ADMM) and using TFOCS as a black box. We illustrate this idea on simulated data.
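
l1-regularized problems like the one described above are commonly solved with iterations whose key step is soft-thresholding, the proximal operator of the l1 norm, which shrinks entries toward zero and sets small ones exactly to zero. The sketch below shows only that building block; whether the authors' linearized ADMM solver uses it in exactly this form is an assumption, and the name `soft_threshold` is hypothetical.

```python
import math


def soft_threshold(v, t):
    """Apply sign(x) * max(|x| - t, 0) to each entry of v (prox of t * l1 norm)."""
    out = []
    for x in v:
        mag = abs(x) - t
        # Entries with magnitude at most t are zeroed, which produces sparsity.
        out.append(math.copysign(mag, x) if mag > 0 else 0.0)
    return out


# Illustrative input, not data from the paper.
shrunk = soft_threshold([3.0, -0.5, 1.0, -2.0], 1.0)
```

Larger thresholds zero out more entries, which is how an l1 term yields sparse canonical vectors.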
