Abstract:Splits on canned beans appear in the process of preparation and canning. Researchers are studying how they are influenced by cooking environment and genotype. However, there is no existing method to automatically quantify or to characterize the severity of splits. To solve this, we propose two measures: the Bean Split Ratio (BSR) that quantifies the overall severity of splits, and the Bean Split Histogram (BSH) that characterizes the size distribution of splits. We create a pixel-wise segmentation method to automatically estimate these measures from images. We also present a bean dataset of recombinant inbred lines of two genotypes, use the BSR and BSH to assess canning quality, and explore heritability of these properties.
Abstract:We consider multi-response and multitask regression models, where the parameter matrix to be estimated is expected to have an unknown grouping structure. The groupings can be along tasks, or features, or both, the last one indicating a bi-cluster or "checkerboard" structure. Discovering this grouping structure along with parameter inference makes sense in several applications, such as multi-response Genome-Wide Association Studies. This additional structure can not only can be leveraged for more accurate parameter estimation, but it also provides valuable information on the underlying data mechanisms (e.g. relationships among genotypes and phenotypes in GWAS). In this paper, we propose two formulations to simultaneously learn the parameter matrix and its group structures, based on convex regularization penalties. We present optimization approaches to solve the resulting problems and provide numerical convergence guarantees. Our approaches are validated on extensive simulations and real datasets concerning phenotypes and genotypes of plant varieties.