Abstract:Large graphs abound in machine learning, data mining, and several related areas. A useful step towards analyzing such graphs is that of obtaining certain summary statistics - e.g., or the expected length of a shortest path between two nodes, or the expected weight of a minimum spanning tree of the graph, etc. These statistics provide insight into the structure of a graph, and they can help predict global properties of a graph. Motivated thus, we propose to study statistical properties of structured subgraphs (of a given graph), in particular, to estimate the expected objective function value of a combinatorial optimization problem over these subgraphs. The general task is very difficult, if not unsolvable; so for concreteness we describe a more specific statistical estimation problem based on spanning trees. We hope that our position paper encourages others to also study other types of graphical structures for which one can prove nontrivial statistical estimates.
Abstract:Cryo-electron microscopy (cryo-EM) is an emerging experimental method to characterize the structure of large biomolecular assemblies. Single particle cryo-EM records 2D images (so-called micrographs) of projections of the three-dimensional particle, which need to be processed to obtain the three-dimensional reconstruction. A crucial step in the reconstruction process is particle picking which involves detection of particles in noisy 2D micrographs with low signal-to-noise ratios of typically 1:10 or even lower. Typically, each picture contains a large number of particles, and particles have unknown irregular and nonconvex shapes.
Abstract:We develop a novel method for detection of signals and reconstruction of images in the presence of random noise. The method uses results from percolation theory. We specifically address the problem of detection of multiple objects of unknown shapes in the case of nonparametric noise. The noise density is unknown and can be heavy-tailed. The objects of interest have unknown varying intensities. No boundary shape constraints are imposed on the objects, only a set of weak bulk conditions is required. We view the object detection problem as a multiple hypothesis testing for discrete statistical inverse problems. We present an algorithm that allows to detect greyscale objects of various shapes in noisy images. We prove results on consistency and algorithmic complexity of our procedures. Applications to cryo-electron microscopy are presented.