Title: Learning Arbitrary Statistical Mixtures of Discrete Distributions

Apr 10, 2015

Jian Li, Yuval Rabani, Leonard J. Schulman, Chaitanya Swamy

Abstract: We study the problem of learning, from unlabeled samples, very general statistical mixture models on large finite sets. Specifically, the model to be learned, $\vartheta$, is a probability distribution over probability distributions $p$, where each such $p$ is a probability distribution over $[n] = \{1,2,\dots,n\}$. When we sample from $\vartheta$, we do not observe $p$ directly, but only indirectly and in a very noisy fashion, by drawing $K$ independent samples from $[n]$ according to the distribution $p$. The problem is to infer $\vartheta$ to high accuracy in transportation (earthmover) distance. We give the first efficient algorithms for learning this mixture model without making any restrictive assumptions on the structure of the distribution $\vartheta$. We bound the quality of the solution as a function of the sample size $K$ and the number of samples used. Our model and results have applications to a variety of unsupervised learning scenarios, including learning topic models and collaborative filtering.
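The sampling process described in the abstract can be illustrated with a minimal sketch. It is not the paper's learning algorithm; it only generates the observations such an algorithm would receive. For simplicity it assumes $\vartheta$ has finite support, represented as a list of (weight, $p$) pairs, whereas the paper allows arbitrary $\vartheta$. The function name `sample_snapshot` and the example mixture are hypothetical.

```python
import random

def sample_snapshot(mixture, K, rng):
    """Draw one observation: pick a hidden p from the mixture, then K i.i.d. items from p.

    mixture: list of (weight, p) pairs, where p is a list of n probabilities
             over the items 1..n (a finite-support stand-in for theta; assumption).
    K:       number of independent draws from the hidden p.
    rng:     a random.Random instance, passed in for reproducibility.
    """
    weights = [w for w, _ in mixture]
    components = [p for _, p in mixture]
    # Step 1: sample the hidden distribution p from theta (p is never returned).
    p = rng.choices(components, weights=weights, k=1)[0]
    # Step 2: sample K items from [n] = {1, ..., n} independently according to p.
    n = len(p)
    return rng.choices(range(1, n + 1), weights=p, k=K)

# Hypothetical two-component mixture over n = 3 items.
example_mixture = [
    (0.5, [0.7, 0.2, 0.1]),
    (0.5, [0.1, 0.2, 0.7]),
]
snapshots = [sample_snapshot(example_mixture, K=5, rng=random.Random(seed))
             for seed in range(3)]
```

Each returned list contains only the $K$ observed items; the learner never sees which $p$ produced them, which is why small $K$ makes the observations very noisy.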

* 23 pages. Preliminary version in the Proceedings of the 47th ACM Symposium on the Theory of Computing (STOC 2015).
