Changlin Wan

A Multi-Layer Regression based Predicable Function Fitting Network

Sep 19, 2022

Changlin Wan, Zhongzhi Shi

Abstract: Functions play an important role in mathematics and many branches of science. With the rapid development of computer technology, a growing number of studies on computational function analysis, e.g., Fast Fourier Transform, Wavelet Transform, and Curve Function, have been presented in recent years. However, these approaches have two main problems: 1) they struggle to handle complex functions that are stationary and non-stationary, periodic and non-periodic, high order and low order; 2) the fitted functions are hard to generalize from training data to test data. In this paper, a multiple regression based function fitting network that solves these two problems is introduced as a predicable function fitting technique. The network consists of three main parts: 1) the stationary transform layer, 2) the feature encoding layers, and 3) the fine tuning regression layer. The stationary transform layer recognizes the order of the input function data and transforms a non-stationary function into a stationary one. The feature encoding layers encode the raw sequential input data into a novel linear regression feature that captures both the structural and the temporal characteristics of the sequential data. The fine tuning regression layer then fits the features to the target ahead values. By combining the linear regression feature layers with a non-linear regression layer, the fitting network produces high-quality fitting results and generalizable predictions. Experiments on both mathematical function examples and real-world function examples verify the efficiency of the proposed technique.

* 14 pages, 3 figures
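
The abstract does not say how the stationary transform layer recognizes the order of the input. As a rough illustration only, the sketch below uses repeated differencing, a common way to remove polynomial trends, and picks the order whose output has the smallest variance. The function names, the `max_order` parameter, and the minimum-variance heuristic are assumptions made for this example, not the authors' method.

```python
def difference(values, order=1):
    """Apply first-order differencing `order` times."""
    out = list(values)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def variance(xs):
    """Population variance of a non-empty sequence."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / len(xs)


def estimate_order(values, max_order=4):
    """Hypothetical heuristic: the smallest differencing order with minimal variance."""
    # min() returns the first minimal element, so ties resolve to the lowest order.
    return min(range(max_order + 1), key=lambda d: variance(difference(values, d)))


# A quadratic trend becomes constant after two rounds of differencing.
series = [t * t for t in range(10)]
order = estimate_order(series)
stationary = difference(series, order)
```

A downstream regression would then be fitted on `stationary` rather than on the raw series, and predictions would be mapped back by cumulative summation.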

Spatially and Robustly Hybrid Mixture Regression Model for Inference of Spatial Dependence

Sep 28, 2021

Wennan Chang, Pengtao Dang, Changlin Wan, Xiaoyu Lu, Yue Fang, Tong Zhao, Yong Zang, Bo Li, Chi Zhang, Sha Cao

Abstract: In this paper, we propose a Spatial Robust Mixture Regression model to investigate the relationship between a response variable and a set of explanatory variables over the spatial domain, assuming that the relationships may exhibit complex spatially dynamic patterns that cannot be captured by constant regression coefficients. Our method integrates the robust finite mixture Gaussian regression model with spatial constraints to simultaneously handle spatial nonstationarity, local homogeneity, and outlier contamination. In contrast to existing spatial regression models, our proposed model assumes the existence of a few distinct regression models that are estimated from observations exhibiting similar response-predictor relationships. As such, the proposed model not only accounts for nonstationarity in the spatial trend, but also clusters observations into a few distinct and homogeneous groups. This provides an advantage in interpretation, since a few stationary sub-processes are identified that capture the predominant relationships between response and predictor variables. Moreover, the proposed method incorporates robust procedures to handle contamination from both regression outliers and spatial outliers. By doing so, we robustly segment the spatial domain into distinct local regions with similar regression coefficients, and sporadic locations that are purely outliers. A rigorous statistical hypothesis testing procedure has been designed to test the significance of such segmentation. Experimental results on many synthetic and real-world datasets demonstrate the robustness, accuracy, and effectiveness of our proposed method, compared with other robust finite mixture regression, spatial regression, and spatial segmentation methods.

* Accepted by ICDM IEEE 2021

Principled Hyperedge Prediction with Structural Spectral Features and Neural Networks

Jun 13, 2021

Changlin Wan, Muhan Zhang, Wei Hao, Sha Cao, Pan Li, Chi Zhang

Abstract: Hypergraphs offer a framework to depict the multilateral relationships in real-world complex data. Predicting higher-order relationships, i.e., hyperedges, has become a fundamental problem for the full understanding of complicated interactions. The development of graph neural networks (GNNs) has greatly advanced the analysis of ordinary graphs with pairwise relations. However, these methods cannot be easily extended to hypergraphs. In this paper, we generalize the challenges of GNNs in representing higher-order data in principle, which are edge- and node-level ambiguities. To overcome these challenges, we present SNALS, which uses a bipartite graph neural network with structural features to tackle the two ambiguity issues collectively. SNALS captures the joint interactions of a hyperedge through its local environment, which is retrieved by collecting the spectrum information of its connections. As a result, SNALS achieves a nearly 30% performance increase compared with the most recent GNN-based models. In addition, we applied SNALS to predict genetic higher-order interactions on 3D genome organization data. SNALS showed consistently high prediction accuracy across different chromosomes and generated novel findings on 4-way gene interactions, which are further validated by existing literature.
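
One well-known reason pairwise graphs lose higher-order information can be shown in a few lines. The sketch below is an illustration written for this listing, not code from SNALS, and it does not claim to reproduce the paper's exact definition of edge-level ambiguity. It compares the clique expansion of two different hypergraphs with their bipartite node-hyperedge incidence; all function names are hypothetical.

```python
from itertools import combinations


def clique_expansion(hyperedges):
    """Pairwise edges obtained by linking every node pair inside each hyperedge."""
    return {frozenset(pair) for edge in hyperedges
            for pair in combinations(sorted(edge), 2)}


def bipartite_incidence(hyperedges):
    """Map each node to the indices of the hyperedges that contain it."""
    incidence = {}
    for index, edge in enumerate(hyperedges):
        for node in edge:
            incidence.setdefault(node, set()).add(index)
    return incidence


# One 3-way interaction versus three separate pairwise interactions.
one_triple = [{"a", "b", "c"}]
three_pairs = [{"a", "b"}, {"b", "c"}, {"a", "c"}]

same_pairwise_graph = clique_expansion(one_triple) == clique_expansion(three_pairs)
same_bipartite_graph = bipartite_incidence(one_triple) == bipartite_incidence(three_pairs)
```

Both hypergraphs collapse to the same triangle under clique expansion, while the bipartite representation keeps them distinct, which is consistent with the abstract's choice of a bipartite graph neural network.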

Denoising individual bias for a fairer binary submatrix detection

Aug 09, 2020

Changlin Wan, Wennan Chang, Tong Zhao, Sha Cao, Chi Zhang

Abstract: Low-rank representation of binary matrices is powerful in disentangling sparse individual-attribute associations and has been widely applied. Existing binary matrix factorization (BMF) or co-clustering (CC) methods often assume i.i.d. background noise. However, this assumption is easily violated in real data, where heterogeneous row- or column-wise probabilities of binary entries result in disparate element-wise background distributions and undermine the rationale of existing methods. We propose a binary data denoising framework, namely BIND, which optimizes the detection of true patterns by estimating the row- or column-wise mixture distribution of patterns and disparate background, and eliminating the binary attributes that are more likely to come from the background. BIND is supported by thoroughly derived mathematical properties of the row- and column-wise mixture distributions. Our experiments on synthetic and real-world data demonstrated that BIND effectively removes background noise and drastically increases the fairness and accuracy of state-of-the-art BMF and CC methods.

* Accepted at CIKM 2020

Geometric All-Way Boolean Tensor Decomposition

Jul 31, 2020

Changlin Wan, Wennan Chang, Tong Zhao, Sha Cao, Chi Zhang

Abstract: Boolean tensors have been broadly used to represent high-dimensional logical data collected on spatial, temporal and/or other relational domains. Boolean Tensor Decomposition (BTD) factorizes a binary tensor into the Boolean sum of multiple rank-1 tensors, which is an NP-hard problem. Existing BTD methods have been limited by their high computational cost in applications to large-scale or higher-order tensors. In this work, we present a computationally efficient BTD algorithm, namely Geometric Expansion for all-order Tensor Factorization (GETF), that sequentially identifies the rank-1 basis components of a tensor from a geometric perspective. We conducted a rigorous theoretical analysis of the validity as well as the algorithmic efficiency of GETF in decomposing all-order tensors. Experiments on both synthetic and real-world data demonstrated that GETF significantly improves reconstruction accuracy and the extraction of latent structures, and that it is an order of magnitude faster than other state-of-the-art methods.
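
To make "the Boolean sum of multiple rank-1 tensors" concrete, the sketch below rebuilds a small 3-way binary tensor from two rank-1 components. It only illustrates the reconstruction side of the definition and is not the GETF algorithm; the helper names and the example vectors are made up for this illustration.

```python
def rank1(a, b, c):
    """Rank-1 Boolean tensor: entry (i, j, k) is a[i] AND b[j] AND c[k]."""
    return [[[ai & bj & ck for ck in c] for bj in b] for ai in a]


def boolean_sum(tensors):
    """Elementwise OR of equally shaped 3-way tensors, so 1 + 1 = 1."""
    first = tensors[0]
    return [[[int(any(t[i][j][k] for t in tensors))
              for k in range(len(first[0][0]))]
             for j in range(len(first[0]))]
            for i in range(len(first))]


# Two overlapping rank-1 components of a 3 x 2 x 2 tensor.
component_1 = rank1([1, 1, 0], [1, 0], [1, 1])
component_2 = rank1([0, 1, 1], [1, 1], [1, 0])
tensor = boolean_sum([component_1, component_2])
```

The cell `(1, 0, 0)` is covered by both components and still equals 1, which is the difference between a Boolean sum and an ordinary sum.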

Supervised clustering of high dimensional data using regularized mixture modeling

Jul 19, 2020

Wennan Chang, Changlin Wan, Yong Zang, Chi Zhang, Sha Cao

Abstract: Identifying relationships between molecular variations and their clinical presentations has been challenged by the heterogeneous causes of a disease. It is imperative to unveil the relationship between the high-dimensional molecular manifestations and the clinical presentations, while taking into account the possible heterogeneity of the study subjects. We proposed a novel supervised clustering algorithm using a penalized mixture regression model, called CSMR, to deal with the challenges in studying the heterogeneous relationships between high-dimensional molecular features and a phenotype. The algorithm was adapted from the classification expectation maximization algorithm, which offers a novel supervised solution to the clustering problem, with substantial improvement in both computational efficiency and biological interpretability. Experimental evaluation on simulated benchmark datasets demonstrated that CSMR can accurately identify the subspaces on which subsets of features are explanatory to the response variables, and that it outperformed the baseline methods. Application of CSMR to a drug sensitivity dataset again demonstrated the superior performance of CSMR over the others, where CSMR is powerful in recapitulating the distinct subgroups hidden in the pool of cell lines with regard to their coping mechanisms to different drugs. CSMR represents a big data analysis tool with the potential to resolve the complexity of translating the clinical manifestations of the disease to the real causes underpinning it. We believe that it will bring new understanding to the molecular basis of a disease, and could be of special relevance in the growing field of personalized medicine.

MEBF: a fast and efficient Boolean matrix factorization method

Sep 09, 2019

Changlin Wan, Wennan Chang, Tong Zhao, Mengya Li, Sha Cao, Chi Zhang

Abstract: Boolean matrices have been used to represent digital information in many fields, including bank transactions, crime records, natural language processing, protein-protein interaction, etc. Boolean matrix factorization (BMF) aims to find an approximation of a binary matrix as the Boolean product of two low-rank Boolean matrices, which could generate a vast amount of information on the patterns of relationships between the features and samples. Inspired by binary matrix permutation theories and geometric segmentation, we developed a fast and efficient BMF approach called MEBF (Median Expansion for Boolean Factorization). Overall, MEBF adopts a heuristic approach to locate binary patterns presented as submatrices that are dense in 1's. At each iteration, MEBF permutes the rows and columns such that the permuted matrix is approximately Upper Triangular-Like (UTL) with the so-called Simultaneous Consecutive-ones Property (SC1P). The largest submatrix dense in 1's lies in the upper triangular area of the permuted matrix, and its location is determined by a geometric segmentation of a triangle. We compared MEBF with other state-of-the-art approaches on data scenarios with different sparsity and noise levels. MEBF demonstrated superior performance, with lower reconstruction error, higher computational efficiency, and more accurate sparse patterns than popular methods such as ASSO, PANDA and MP. We demonstrated the application of MEBF on both binary and non-binary data sets, and revealed its further potential in knowledge retrieval and data denoising.
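
The Boolean product and the reconstruction error mentioned in the abstract can be written down directly. The sketch below is a minimal illustration of those two definitions, assuming matrices stored as lists of 0/1 rows; it is not the MEBF heuristic, and the function names and example matrices are invented for this listing.

```python
def boolean_product(A, B):
    """Boolean product: entry (i, j) is OR over k of (A[i][k] AND B[k][j])."""
    return [[int(any(A[i][k] and B[k][j] for k in range(len(B))))
             for j in range(len(B[0]))]
            for i in range(len(A))]


def reconstruction_error(X, A, B):
    """Number of entries where X differs from the Boolean product of A and B."""
    R = boolean_product(A, B)
    return sum(x != r for row_x, row_r in zip(X, R) for x, r in zip(row_x, row_r))


# Rank-2 factors: A is 3 x 2 and B is 2 x 3.
A = [[1, 0], [1, 1], [0, 1]]
B = [[1, 1, 0], [0, 1, 1]]
product = boolean_product(A, B)

# An observed matrix that differs from the product in exactly one entry.
X = [[1, 1, 0], [1, 1, 1], [0, 1, 0]]
error = reconstruction_error(X, A, B)
```

A factorization method is then judged by how small it can make this error for a given rank.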
