Abstract:Langevin Monte Carlo (LMC) and its stochastic gradient versions are powerful algorithms for sampling from complex high-dimensional distributions. To sample from a distribution with density $\pi(\theta)\propto \exp(-U(\theta))$, LMC iteratively generates the next sample by taking a step along the negative gradient $-\nabla U$ with added Gaussian perturbations. Expectations w.r.t. the target distribution $\pi$ are estimated by averaging over LMC samples. In ordinary Monte Carlo, it is well known that the estimation error can be substantially reduced by replacing independent random samples with quasi-random samples such as low-discrepancy sequences. In this work, we show that the estimation error of LMC can also be reduced by using quasi-random samples. Specifically, we propose to use completely uniformly distributed (CUD) sequences with a certain low-discrepancy property to generate the Gaussian perturbations. Under smoothness and convexity conditions, we prove that LMC with a low-discrepancy CUD sequence achieves smaller error than standard LMC. The theoretical analysis is supported by numerical experiments, which demonstrate the effectiveness of our approach.
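As a rough illustration of the LMC iteration described in this abstract, the sketch below runs the standard unadjusted update $\theta_{k+1} = \theta_k - h\nabla U(\theta_k) + \sqrt{2h}\,\xi_k$ on a one-dimensional standard normal target, $U(\theta)=\theta^2/2$. The step size `step`, the function names, and the toy target are assumptions made for illustration, and the perturbations $\xi_k$ here are ordinary IID pseudo-random Gaussians, not the CUD sequence the abstract proposes.

```python
import math
import random


def lmc_samples(grad_u, theta0, step, n_steps, rng):
    """Unadjusted Langevin iterates for a one-dimensional target."""
    theta = theta0
    samples = []
    for _ in range(n_steps):
        xi = rng.gauss(0.0, 1.0)  # IID Gaussian perturbation (assumption: not CUD)
        theta = theta - step * grad_u(theta) + math.sqrt(2.0 * step) * xi
        samples.append(theta)
    return samples


def grad_u_gaussian(theta):
    # U(theta) = theta^2 / 2, so pi is the standard normal density.
    return theta


rng = random.Random(0)
draws = lmc_samples(grad_u_gaussian, theta0=0.0, step=0.05, n_steps=50_000, rng=rng)

# Expectations under pi are estimated by averaging over the LMC iterates.
mean_est = sum(draws) / len(draws)
second_moment_est = sum(d * d for d in draws) / len(draws)
print(mean_est, second_moment_est)
```

For this toy target the iterates have stationary variance $1/(1-h/2)$ instead of 1, so the second-moment estimate carries a small discretization bias in addition to sampling noise.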
Abstract:Owing to its promising ability to save hardware cost and spectrum resources, integrated sensing and communication (ISAC) is regarded as a revolutionary technology for future sixth-generation (6G) networks. The mono-static ISAC systems considered in most existing works achieve only limited sensing performance because of their single observation angle and easily blocked transmission links, which motivates researchers to investigate cooperative ISAC networks. To further improve the degrees of freedom (DoFs) of cooperative ISAC networks, the transmitter-receiver selection problem, i.e., the base station (BS) mode selection problem, merits study. However, to the best of our knowledge, this crucial problem has not been extensively studied in existing works. In this paper, we consider joint BS mode selection, transmit beamforming, and receive filter design for cooperative cell-free ISAC networks, where multiple BSs cooperatively serve communication users and detect targets. We aim to maximize the sum of sensing signal-to-interference-plus-noise ratios (SINRs) under communication SINR requirements, a total power budget, and constraints on the numbers of transmitters and receivers. An efficient joint beamforming design algorithm and three different heuristic BS mode selection methods are proposed to solve this non-convex NP-hard problem. Simulation results demonstrate the advantages of cooperative ISAC networks, the importance of BS mode selection, and the effectiveness of our proposed joint design algorithms.
Abstract:Reconfigurable intelligent surface (RIS) is a revolutionary technology for sixth-generation (6G) networks owing to its ability to manipulate wireless environments. As a frequency-selective device, RIS can only effectively shape the propagation of signals within a certain frequency band. Due to this frequency-selective property, the deployment of RIS in cellular networks will introduce a complicated base station (BS)-RIS-user association issue since adjacent BSs operate at different frequency bands. In this paper, with the consideration of the frequency-selective characteristics of RIS, we aim to jointly optimize BS-RIS-user association, active beamforming at BSs, and passive beamforming of RIS to maximize the sum-rate of a RIS-assisted cellular network. We first leverage $l_0$-norm to efficiently integrate BS-RIS-user association with active and passive beamforming. Then, we adopt fractional programming (FP) and block coordinate descent (BCD) methods to deal with logarithmic and fractional parts and decouple the joint association and beamforming design problem into several sub-problems. Efficient algorithms which combine $l_0$-norm approximation, majorization-minimization (MM), and alternating direction method of multipliers (ADMM) are developed to alternately solve the sub-problems. Extensive simulation results illustrate the importance of BS-RIS-user association optimization in RIS-assisted cellular networks and verify the effectiveness of the proposed joint association and beamforming design algorithm.
Abstract:We propose a method for selective inference after a model selection procedure that is potentially a black box. In the conditional post-selection inference framework, a crucial quantity in determining the post-selection distribution of a test statistic is the probability of selecting the model conditional on the statistic. By repeatedly running the model selection procedure on bootstrapped datasets, we can generate training data with binary responses indicating the selection event as well as specially designed covariates, which are then used to learn the selection probability. We prove that the constructed confidence intervals are asymptotically valid if we can learn the selection probability sufficiently well in a neighborhood of the target parameter. The validity of the proposed algorithm is verified by several examples.
Abstract:Reconfigurable intelligent surface (RIS) has been regarded as a revolutionary and promising technology owing to its ability to adaptively shape the wireless propagation environment. However, as a frequency-selective device, the RIS can only effectively provide tunable phase-shifts for signals within a certain frequency band. Thus, base-station (BS)-RIS-user association is an important issue for maximizing the efficiency and ability of the RIS in cellular networks. In this paper, we consider a RIS-aided cellular network and aim to maximize the sum-rate of downlink transmissions by designing BS-RIS-user association as well as the active and passive beamforming of BSs and RIS, respectively. A dynamically successive access algorithm is developed to design the user association. During the dynamic access process, an iterative algorithm is proposed to alternately obtain the active and passive beamforming. Finally, the optimal BS-RIS association is obtained by an exhaustive search method. Simulation results illustrate the significant performance improvement of the proposed BS-RIS-user association and beamforming design algorithm.
Abstract:Many machine learning problems optimize an objective that must be measured with noise. The primary method is first-order stochastic gradient descent using one or more Monte Carlo (MC) samples at each step. There are settings where ill-conditioning makes second-order methods such as L-BFGS more effective. We study the use of randomized quasi-Monte Carlo (RQMC) sampling for such problems. When MC sampling has a root mean squared error (RMSE) of $O(n^{-1/2})$, RQMC has an RMSE of $o(n^{-1/2})$ that can be close to $O(n^{-3/2})$ in favorable settings. We prove that improved sampling accuracy translates directly to improved optimization. In our empirical investigations for variational Bayes, using RQMC with stochastic L-BFGS greatly speeds up the optimization, and sometimes finds a better parameter value than MC does.
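To make the MC versus RQMC comparison in this abstract concrete, here is a minimal sketch that estimates $\int_0^1 x^2\,dx = 1/3$ with plain MC and with a randomly shifted base-2 van der Corput sequence. The integrand, the sample size, and all function names are assumptions chosen for illustration. The random shift is only one simple form of RQMC; it is not the scrambled-net construction usually associated with rates near $O(n^{-3/2})$, and this toy integral is not the variational Bayes objective studied in the paper.

```python
import math
import random


def van_der_corput(i):
    """Radical inverse of the integer i in base 2."""
    x, denom = 0.0, 1.0
    while i > 0:
        denom *= 2.0
        x += (i % 2) / denom
        i //= 2
    return x


def mc_estimate(f, n, rng):
    """Plain Monte Carlo average over n IID uniform points."""
    return sum(f(rng.random()) for _ in range(n)) / n


def rqmc_estimate(f, n, rng):
    """Average over n van der Corput points sharing one random shift mod 1."""
    shift = rng.random()
    return sum(f((van_der_corput(i) + shift) % 1.0) for i in range(n)) / n


def rmse(estimator, f, truth, n, reps, rng):
    """Root mean squared error of an estimator over independent replicates."""
    sq = [(estimator(f, n, rng) - truth) ** 2 for _ in range(reps)]
    return math.sqrt(sum(sq) / reps)


def f(x):
    return x * x  # integral over [0, 1] is 1/3


rng = random.Random(1)
rmse_mc = rmse(mc_estimate, f, 1.0 / 3.0, n=256, reps=200, rng=rng)
rmse_rqmc = rmse(rqmc_estimate, f, 1.0 / 3.0, n=256, reps=200, rng=rng)
print(rmse_mc, rmse_rqmc)
```

Because each shifted point is still uniform on $[0,1)$, the RQMC estimate stays unbiased, and the replicates give an empirical RMSE for both methods.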
Abstract:In network applications, it has become increasingly common to obtain datasets in the form of multiple networks observed on the same set of subjects, where each network is obtained in a related but different experimental condition or application scenario. Such datasets can be modeled by multilayer networks where each layer is a separate network itself while different layers are associated and share some common information. The present paper studies community detection in a stylized yet informative inhomogeneous multilayer network model. In our model, layers are generated by different stochastic block models, the community structures of which are (random) perturbations of a common global structure while the connecting probabilities in different layers are not related. Focusing on the symmetric two-block case, we establish minimax rates for both \emph{global estimation} of the common structure and \emph{individualized estimation} of layer-wise community structures. Both minimax rates have sharp exponents. In addition, we provide an efficient algorithm that is simultaneously asymptotically minimax optimal for both estimation tasks under mild conditions. The optimal rates depend on the \emph{parity} of the number of most informative layers, a phenomenon that is caused by inhomogeneity across layers.
Abstract:We participate in the WMT 2020 shared news translation task on Chinese to English. Our system is based on the Transformer (Vaswani et al., 2017a) with effective variants and the DTMT (Meng and Zhang, 2019) architecture. In our experiments, we employ data selection, several synthetic data generation approaches (i.e., back-translation, knowledge distillation, and iterative in-domain knowledge transfer), advanced finetuning approaches, and self-BLEU based model ensemble. Our constrained Chinese to English system achieves a case-sensitive BLEU score of 36.9, which is the highest among all submissions.
Abstract:We provide an exact analysis of the limiting spectrum of matrices randomly projected with either the subsampled randomized Hadamard transform or truncated Haar matrices. We characterize this limiting distribution through its Stieltjes transform, a classical object in random matrix theory, and compute the first and second inverse moments. We leverage the limiting spectrum and asymptotic freeness of random matrices to obtain an exact analysis of iterative sketching methods for solving least squares problems. Our results also yield optimal step-sizes and convergence rates in terms of simple closed-form expressions. Moreover, we show that the convergence rates for Haar and randomized Hadamard matrices are identical and uniformly improve upon those of Gaussian random projections. The developed techniques and formulas can be applied to a plethora of randomized algorithms that employ fast randomized Hadamard dimension reduction.
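The subsampled randomized Hadamard transform named in this abstract can be sketched in a few lines. The version below uses one common convention, $Sx = \sqrt{n/m}\,P\,(H/\sqrt{n})\,D\,x$, with $D$ a diagonal of random signs, $H$ the Walsh-Hadamard matrix, and $P$ a uniform selection of $m$ rows without replacement. This convention, the function names, and the test vector are assumptions for illustration, and the paper's exact definition may differ.

```python
import math
import random


def fwht(vec):
    """Unnormalized fast Walsh-Hadamard transform; len(vec) must be a power of 2."""
    out = list(vec)
    n = len(out)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                a, b = out[j], out[j + h]
                out[j], out[j + h] = a + b, a - b
        h *= 2
    return out


def srht(vec, m, rng):
    """Sketch vec (length n, a power of 2) down to m coordinates."""
    n = len(vec)
    signs = [rng.choice((-1.0, 1.0)) for _ in range(n)]  # diagonal D
    rotated = fwht([s * v for s, v in zip(signs, vec)])  # H D x, unnormalized
    rows = rng.sample(range(n), m)                       # uniform row subsample P
    # 1/sqrt(n) makes H orthonormal and sqrt(n/m) rescales the subsample.
    scale = 1.0 / math.sqrt(m)
    return [scale * rotated[i] for i in rows]


rng = random.Random(2)
x = [rng.gauss(0.0, 1.0) for _ in range(1024)]
sketch = srht(x, m=256, rng=rng)
norm_x = math.sqrt(sum(v * v for v in x))
norm_sketch = math.sqrt(sum(v * v for v in sketch))
print(norm_x, norm_sketch)
```

The transform costs $O(n\log n)$ per vector, and the sketch preserves the squared norm in expectation. To sketch a matrix, the same `signs` and `rows` would have to be reused for every column, which this single-vector helper does not do.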
Abstract:Large datasets create opportunities as well as analytic challenges. A recent development is to use random projection or sketching methods for dimension reduction in statistics and machine learning. In this work, we study the statistical performance of sketching algorithms for linear regression. Suppose we randomly project the data matrix and the outcome using a random sketching matrix reducing the sample size, and do linear regression on the resulting data. How much do we lose compared to the original linear regression? The existing theory does not give a precise enough answer, and this has been a bottleneck for using random projections in practice. In this paper, we introduce a new mathematical approach to the problem, relying on very recent results from asymptotic random matrix theory and free probability theory. This is a perfect fit, as the sketching matrices are random in practice. We allow the dimension and sample sizes to have an arbitrary ratio. We study the most popular sketching methods in a unified framework, including random projection methods (Gaussian and iid projections, uniform orthogonal projections, subsampled randomized Hadamard transforms), as well as sampling methods (including uniform, leverage-based, and greedy sampling). We find precise and simple expressions for the accuracy loss of these methods. These go beyond classical Johnson-Lindenstrauss type results, because they are exact, instead of being bounds up to constants. Our theoretical formulas are surprisingly accurate in extensive simulations and on two empirical datasets.
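The question posed in this abstract, how much is lost by regressing on sketched data, can be tried out on a toy problem. The sketch below fits a single-predictor, no-intercept regression on the full data and on data projected by an $m \times n$ IID Gaussian matrix. The model, the sizes `n` and `m`, and the helper names are assumptions for illustration, and the code does not reproduce the paper's formulas for the accuracy loss.

```python
import math
import random


def ols_slope(x, y):
    """Least-squares slope for the no-intercept model y = beta * x + noise."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def gaussian_sketch(cols, m, rng):
    """Apply one m-by-n IID Gaussian matrix S to every column in cols."""
    n = len(cols[0])
    sketched = [[0.0] * m for _ in cols]
    for r in range(m):
        # One row of S, with entries of variance 1/m, shared by all columns.
        row = [rng.gauss(0.0, 1.0) / math.sqrt(m) for _ in range(n)]
        for c, col in enumerate(cols):
            sketched[c][r] = sum(s * v for s, v in zip(row, col))
    return sketched


rng = random.Random(3)
n, m, beta = 2000, 200, 2.0
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
y = [beta * xi + rng.gauss(0.0, 1.0) for xi in x]

beta_full = ols_slope(x, y)                 # regression on all n rows
x_s, y_s = gaussian_sketch([x, y], m, rng)  # same S applied to x and y
beta_sketch = ols_slope(x_s, y_s)           # regression on m sketched rows
print(beta_full, beta_sketch)
```

Repeating the last three lines over many draws of the sketch gives an empirical picture of how the sketched estimate spreads around the full-data estimate as the ratio of `m` to `n` changes.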