Abstract: Economic growth results from countries' accumulation of organizational and technological capabilities. The Economic and Product Complexity Indices, introduced as an attempt to measure these capabilities from a country's basket of exported products, have become popular tools for studying economic development, the geography of innovation, and industrial policies. Despite this reception, interpreting these indicators has proved difficult. Although the original Method of Reflections suggested a direct interconnection between country and product metrics, it has since been shown that the Economic and Product Complexity Indices result from a spectral clustering algorithm that groups similar countries and similar products separately. This more recent approach to economic and product complexity conflicts with the original one because it treats countries and products separately. However, building on previous interpretations of the indices and on recent developments in spectral clustering, we show that these indices simultaneously identify two co-clusters of similar countries and products. This viewpoint reconciles the spectral clustering interpretation of the indices with the original Method of Reflections interpretation. By proving the often-neglected intimate relationship between country and product complexity, this approach emphasizes the role of a selected set of products in determining economic development, while extending the range of applications of these indicators in economics.
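To make the link between country and product complexity concrete, the following is a minimal sketch that assumes the commonly used eigenvector formulation of the two indices (the abstract itself does not give formulas). The toy export matrix `M`, the function names, and the sign convention (ECI positively correlated with diversity) are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def second_eigenvector(matrix):
    """Eigenvector associated with the second-largest eigenvalue."""
    eigenvalues, eigenvectors = np.linalg.eig(matrix)
    order = np.argsort(-eigenvalues.real)
    return eigenvectors[:, order[1]].real

def standardize(v):
    return (v - v.mean()) / v.std()

def complexity_indices(M):
    """Return (ECI, PCI) for a binary country-by-product matrix M.

    Assumes every country exports at least one product and every
    product is exported by at least one country.
    """
    M = np.asarray(M, dtype=float)
    diversity = M.sum(axis=1)          # products exported per country
    ubiquity = M.sum(axis=0)           # countries exporting each product
    A = M / diversity[:, None]         # countries x products, rows sum to 1
    B = M.T / ubiquity[:, None]        # products x countries, rows sum to 1
    eci = standardize(second_eigenvector(A @ B))
    pci = standardize(second_eigenvector(B @ A))
    # Eigenvector signs are arbitrary: orient ECI along diversity,
    # then orient PCI so that it agrees with ECI.
    if np.corrcoef(eci, diversity)[0, 1] < 0:
        eci = -eci
    if np.corrcoef(eci, A @ pci)[0, 1] < 0:
        pci = -pci
    return eci, pci

# Hypothetical toy data: 3 countries (rows) and 4 products (columns).
M = np.array([[1, 1, 1, 1],
              [1, 1, 0, 0],
              [1, 0, 0, 0]])
eci, pci = complexity_indices(M)

# Average PCI of the products each country exports.
mean_pci_of_exports = (M / M.sum(axis=1, keepdims=True)) @ pci
print(np.corrcoef(eci, mean_pci_of_exports)[0, 1])
```

Under this formulation, a country's ECI is an affine function of the average PCI of the products it exports, so the printed correlation should equal 1 up to rounding. Splitting countries and products by the sign of their index then yields two groups of countries paired with two groups of products, which is one simple way to picture the co-clusters the abstract refers to.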
Abstract: A growing number of problems involve data with one infinite continuous dimension: functional data. In this paper, we introduce the funLOCI algorithm, which identifies functional local clusters, or functional loci, i.e., subsets of functions exhibiting similar behaviour across the same continuous subset of the domain. The definition of functional local clusters leverages ideas from multivariate and functional clustering and biclustering, and it is based on an additive model that takes into account the shape of the curves. funLOCI is a three-step algorithm based on divisive hierarchical clustering. Dendrograms make it possible to visualize and guide the search procedure and the selection of cutting thresholds. To deal with the large number of local clusters, an extra step is implemented to reduce the number of results to a minimum.
Abstract: In the last two decades, several biclustering methods have been developed as new unsupervised learning techniques to simultaneously cluster the rows and columns of a data matrix. These algorithms play a central role in contemporary machine learning and in many applications, e.g., in computational biology and bioinformatics. The H-score is the evaluation score underlying the seminal biclustering algorithm by Cheng and Church, as well as many subsequent biclustering methods. In this paper, we characterize a potentially troublesome bias in this score that can distort biclustering results. We prove, both analytically and by simulation, that the average H-score increases with the number of rows/columns in a bicluster. This makes the H-score, and hence all algorithms based on it, biased towards small clusters. Based on our analytical proof, we provide a straightforward way to correct this bias, allowing users to accurately compare biclusters.
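The size dependence described above can be illustrated with a small simulation. The sketch below assumes the usual definition of the H-score as the mean squared residue of Cheng and Church, and uses i.i.d. standard normal noise as a stand-in for unstructured data. The function names, trial count, and seed are illustrative choices, and the simulation is not the one reported in the paper.

```python
import numpy as np

def h_score(block):
    """Mean squared residue of a bicluster given as a 2-D array."""
    block = np.asarray(block, dtype=float)
    row_means = block.mean(axis=1, keepdims=True)
    col_means = block.mean(axis=0, keepdims=True)
    overall_mean = block.mean()
    residue = block - row_means - col_means + overall_mean
    return float(np.mean(residue ** 2))

def average_h_score(n_rows, n_cols, n_trials=2000, seed=0):
    """Average H-score of pure-noise blocks of a given size."""
    rng = np.random.default_rng(seed)
    scores = [h_score(rng.standard_normal((n_rows, n_cols)))
              for _ in range(n_trials)]
    return float(np.mean(scores))

# A perfectly additive block (row effect + column effect) has zero residue.
additive = np.add.outer([1.0, 2.0, 5.0], [0.0, 10.0, 20.0, 30.0])
print(h_score(additive))

# The same noise looks "more coherent" in smaller blocks.
for size in (2, 5, 20):
    print(size, average_h_score(size, size))
```

For i.i.d. noise with unit variance, a standard degrees-of-freedom argument gives an expected H-score of (1 - 1/n)(1 - 1/m) for an n-by-m block, so the simulated averages should rise towards 1 as the block grows. This factor is offered only as intuition for the bias; it is not necessarily the correction derived in the paper.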