Fantine Mordelet

CBIO

TIGRESS: Trustful Inference of Gene REgulation using Stability Selection

May 06, 2012

Anne-Claire Haury, Fantine Mordelet, Paola Vera-Licona, Jean-Philippe Vert

Abstract: Inferring the structure of gene regulatory networks (GRN) from gene expression data has many applications, from the elucidation of complex biological processes to the identification of potential drug targets. It is, however, a notoriously difficult problem, for which the many existing methods reach limited accuracy. In this paper, we formulate GRN inference as a sparse regression problem and investigate the performance of a popular feature selection method, least angle regression (LARS) combined with stability selection. We introduce a novel, robust and accurate scoring technique for stability selection, which improves the performance of feature selection with LARS. The resulting method, which we call TIGRESS (Trustful Inference of Gene REgulation using Stability Selection), was ranked among the top methods in the DREAM5 gene network reconstruction challenge. We investigate in depth the influence of the various parameters of the method and show that fine parameter tuning can lead to significant improvements and state-of-the-art performance for GRN inference on benchmark data. This study confirms the potential of feature selection techniques for GRN inference. Code and data are available at http://cbio.ensmp.fr/~ahaury. TIGRESS can also be run online on GenePattern: http://www.broadinstitute.org/cancer/software/genepattern/.
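
The idea of combining a sparse feature selector with stability selection can be illustrated with a small sketch. This is not the TIGRESS implementation: it uses only the Python standard library, swaps LARS for a simple matching-pursuit-style greedy search, and scores each feature by its plain selection frequency rather than the scoring technique introduced in the paper. The function names, the half-sample resampling, the random feature reweighting range `alpha`, and the toy data are all assumptions made for illustration.

```python
import random


def greedy_path(X, y, n_steps):
    """Return the first n_steps features picked by a matching-pursuit-style
    greedy search (a simple stand-in for LARS, not LARS itself)."""
    n, p = len(X), len(X[0])
    residual = list(y)
    chosen = []
    for _ in range(n_steps):
        best, best_dot, best_norm2 = None, 0.0, 1.0
        for j in range(p):
            if j in chosen:
                continue
            # Unnormalized correlation between feature j and the residual.
            dot = sum(X[i][j] * residual[i] for i in range(n))
            if abs(dot) > abs(best_dot):
                norm2 = sum(X[i][j] ** 2 for i in range(n))
                best, best_dot, best_norm2 = j, dot, norm2
        if best is None:
            break
        chosen.append(best)
        # Remove the chosen feature's contribution from the residual.
        coef = best_dot / best_norm2
        residual = [residual[i] - coef * X[i][best] for i in range(n)]
    return chosen


def stability_scores(X, y, n_steps=2, n_resamples=200, alpha=0.5, seed=0):
    """Selection frequency of each feature over randomized half-samples."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_resamples):
        rows = rng.sample(range(n), n // 2)  # random half of the samples
        weights = [rng.uniform(alpha, 1.0) for _ in range(p)]  # perturb features
        Xs = [[X[i][j] * weights[j] for j in range(p)] for i in rows]
        ys = [y[i] for i in rows]
        for j in greedy_path(Xs, ys, n_steps):
            counts[j] += 1
    return [c / n_resamples for c in counts]


def make_toy_data(n_samples=80, n_features=6, seed=1):
    """Hypothetical data: the target depends only on features 0 and 3."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
    y = [2.0 * row[0] - 1.5 * row[3] + rng.gauss(0, 0.1) for row in X]
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    for j, s in enumerate(stability_scores(X, y)):
        print(f"feature {j}: selection frequency {s:.2f}")
```

In a GRN setting, under these assumptions, `y` would hold the expression of one target gene and the columns of `X` the expression of candidate regulators, with the procedure repeated for each target gene.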

ProDiGe: PRioritization Of Disease Genes with multitask machine learning from positive and unlabeled examples

Jun 01, 2011

Fantine Mordelet, Jean-Philippe Vert

Abstract: Elucidating the genetic basis of human diseases is a central goal of genetics and molecular biology. While traditional linkage analysis and modern high-throughput techniques often provide long lists of tens or hundreds of disease gene candidates, the identification of disease genes among the candidates remains time-consuming and expensive. Efficient computational methods are therefore needed to prioritize genes within the list of candidates, by exploiting the wealth of information available about the genes in various databases. Here we propose ProDiGe, a novel algorithm for Prioritization of Disease Genes. ProDiGe implements a novel machine learning strategy based on learning from positive and unlabeled examples, which makes it possible to integrate various sources of information about the genes, to share information about known disease genes across diseases, and to perform genome-wide searches for new disease genes. Experiments on real data show that ProDiGe outperforms state-of-the-art methods for the prioritization of genes in human diseases.

A bagging SVM to learn from positive and unlabeled examples

Oct 05, 2010

Fantine Mordelet, Jean-Philippe Vert

Abstract: We consider the problem of learning a binary classifier from a training set of positive and unlabeled examples, both in the inductive and in the transductive setting. This problem, often referred to as PU learning, differs from the standard supervised classification problem by the lack of negative examples in the training set. It corresponds to a ubiquitous situation in many applications such as information retrieval or gene ranking, when we have identified a set of data of interest sharing a particular property, and we wish to automatically retrieve additional data sharing the same property among a large and easily available pool of unlabeled data. We propose a conceptually simple method, akin to bagging, to approach both inductive and transductive PU learning problems, by converting them into a series of supervised binary classification problems discriminating the known positive examples from random subsamples of the unlabeled set. We empirically demonstrate the relevance of the method on simulated and real data, where it performs at least as well as existing methods while being faster.
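
The bagging scheme described in this abstract can be sketched as follows, in the transductive setting. This is a rough illustration, not the authors' code: to stay within the Python standard library it replaces the SVM with a nearest-centroid linear scorer, and it averages each unlabeled point's score over the rounds in which that point was left out of the random subsample, which is an assumption of this sketch. All function names, parameter values, and the toy data are hypothetical.

```python
import random


def centroid(points):
    """Coordinate-wise mean of a list of equal-length vectors."""
    dim = len(points[0])
    return [sum(p[k] for p in points) / len(points) for k in range(dim)]


def train_centroid_scorer(pos, neg):
    """Stand-in for an SVM: a linear score along the direction from the
    negative centroid to the positive centroid, zero at their midpoint."""
    cp, cn = centroid(pos), centroid(neg)
    w = [a - b for a, b in zip(cp, cn)]
    bias = -sum(wi * (a + b) / 2 for wi, a, b in zip(w, cp, cn))

    def score(x):
        return sum(wi * xi for wi, xi in zip(w, x)) + bias

    return score


def bagging_pu_scores(positives, unlabeled, subsample_size, n_rounds=100, seed=0):
    """Score each unlabeled point by averaging classifiers trained on
    positives versus random subsamples of the unlabeled set."""
    rng = random.Random(seed)
    totals = [0.0] * len(unlabeled)
    counts = [0] * len(unlabeled)
    for _ in range(n_rounds):
        # Treat a random subsample of the unlabeled set as negatives.
        bag = set(rng.sample(range(len(unlabeled)), subsample_size))
        scorer = train_centroid_scorer(positives, [unlabeled[i] for i in bag])
        # Score only the points that were not used for training this round.
        for i, x in enumerate(unlabeled):
            if i not in bag:
                totals[i] += scorer(x)
                counts[i] += 1
    return [t / c if c else 0.0 for t, c in zip(totals, counts)]


if __name__ == "__main__":
    rng = random.Random(42)
    positives = [[rng.gauss(3, 0.5), rng.gauss(3, 0.5)] for _ in range(10)]
    hidden = [[rng.gauss(3, 0.5), rng.gauss(3, 0.5)] for _ in range(5)]
    negatives = [[rng.gauss(0, 0.5), rng.gauss(0, 0.5)] for _ in range(30)]
    scores = bagging_pu_scores(positives, hidden + negatives, subsample_size=10)
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    print("top-ranked unlabeled indices:", ranked[:5])
```

Higher averaged scores indicate unlabeled points that look more like the known positives; ranking the unlabeled pool by this score is one way to retrieve likely positives.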
