Abstract:Building on recent studies of large-dimensional kernel regression, particularly those involving inner product kernels on the sphere $\mathbb{S}^{d}$, we investigate the Pinsker bound for inner product kernel regression in such settings. Specifically, we address the scenario where the sample size $n$ is given by $\alpha d^{\gamma}(1+o_{d}(1))$ for some $\alpha, \gamma>0$. We determine the exact minimax risk for kernel regression in this setting, identifying not only the minimax rate but also the exact constant associated with the excess risk, known as the Pinsker constant.
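For context, a Pinsker-type bound pins down the leading constant of the minimax risk, not just its rate. Schematically (with placeholder notation $\Theta$, $r_n$, $P$, not the paper's exact statement), $$\inf_{\hat f}\ \sup_{f^*\in\Theta}\ \mathbb{E}\,\big\|\hat f - f^*\big\|_{L^2}^2 \;=\; \big(P + o_d(1)\big)\, r_n,$$ where $r_n$ is the minimax rate of the excess risk and $P$ is the Pinsker constant.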
Abstract:The saturation effect refers to the phenomenon that kernel ridge regression (KRR) fails to achieve the information-theoretic lower bound when the smoothness of the underlying true function exceeds a certain level. The saturation effect has been widely observed in practice, and a saturation lower bound for KRR has been conjectured for decades. In this paper, we provide a proof of this long-standing conjecture.
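To fix notation, KRR solves $$\hat f_\lambda \;=\; \arg\min_{f\in\mathcal H}\ \frac{1}{n}\sum_{i=1}^{n}\big(f(x_i)-y_i\big)^2 \;+\; \lambda\,\|f\|_{\mathcal H}^2,$$ and the saturation effect can be stated schematically as follows (in one common parametrization, which may differ from the paper's): if the kernel eigenvalues decay as $\lambda_i \asymp i^{-\beta}$ and the truth has source condition $s$, the information-theoretic lower bound scales as $n^{-s\beta/(s\beta+1)}$, while for $s$ above the saturation threshold the KRR risk cannot decay faster than $n^{-2\beta/(2\beta+1)}$ for any choice of $\lambda$.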
Abstract:The generalization ability of kernel interpolation in large dimensions (i.e., $n \asymp d^{\gamma}$ for some $\gamma>0$) might be one of the most interesting problems in the recent renaissance of kernel regression, since it may help us understand the 'benign overfitting phenomenon' reported in the neural networks literature. Focusing on the inner product kernel on the sphere, we fully characterize the exact orders of both the variance and the bias of large-dimensional kernel interpolation under various source conditions $s\geq 0$. Consequently, we obtain the $(s,\gamma)$-phase diagram of large-dimensional kernel interpolation, i.e., we determine the regions in the $(s,\gamma)$-plane where kernel interpolation is minimax optimal, sub-optimal, and inconsistent.
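A minimal sketch of the estimator being analyzed, namely minimum-norm kernel interpolation with an inner product kernel on the sphere; the particular kernel, target function, and data below are illustrative, not the paper's experimental setup:

```python
import numpy as np

def inner_product_kernel(X, Z):
    """Inner product kernel k(x, z) = Phi(<x, z>); Phi(t) = e^t is one admissible choice."""
    return np.exp(X @ Z.T)

d, n = 50, 500
rng = np.random.default_rng(0)
X = rng.standard_normal((n, d + 1))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # samples on the sphere S^d
y = X[:, 0] + 0.1 * rng.standard_normal(n)      # toy target plus noise

K = inner_product_kernel(X, X)
alpha = np.linalg.solve(K, y)                   # interpolation: no ridge term

def predict(Xnew):
    return inner_product_kernel(Xnew, X) @ alpha  # f_hat fits the training data exactly
```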
Abstract:Multi-target detection is one of the primary tasks in radar-based localization and sensing, typically built on phased array antennas. However, the bulky hardware of the phased array restricts its potential for enhancing detection accuracy, since the cost and power of the phased array can become unaffordable as its physical aperture scales up in pursuit of higher beam shaping capabilities. To resolve this issue, we propose a radar system enabled by reconfigurable holographic surfaces (RHSs), novel metasurface antennas composed of cost-effective, power-efficient metamaterial elements, which performs multi-target detection in an adaptive manner. Unlike the phase-controlled structure of the phased array, the RHS applies beamforming by controlling the radiation amplitudes of its elements. Consequently, traditional beamforming schemes designed for phased arrays cannot be directly applied to RHSs due to this structural difference. To tackle this challenge, a waveform and amplitude optimization algorithm (WAOA) is designed to jointly optimize the radar waveform and the RHS amplitudes in order to improve the detection accuracy. Simulation results reveal that the proposed RHS-enabled radar increases the probability of detection by 0.13 compared to phased array radars when six iterations of adaptive detection are performed at the same hardware cost.
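A toy illustration of the structural difference: a phased array steers by adjusting per-element phases, whereas an RHS, series-fed by a surface wave with a fixed propagation phase, can only scale per-element radiation amplitudes. The element count, spacing, surface-wave refractive index, and holographic amplitude rule below are all illustrative assumptions, not the paper's design:

```python
import numpy as np

N, spacing = 32, 0.25          # elements and spacing in wavelengths (illustrative)
theta = np.deg2rad(np.linspace(-90, 90, 721))
n = np.arange(N)
steer = np.exp(1j * 2 * np.pi * spacing * np.outer(n, np.sin(theta)))

target = np.deg2rad(20.0)

# Phased array: free choice of per-element phase shifts.
w_phased = np.exp(-1j * 2 * np.pi * spacing * n * np.sin(target))

# RHS: the series-feed phase beta_n is fixed; only real amplitudes in [0, 1]
# are tunable (a standard holographic amplitude rule, used here for illustration).
beta = 2 * np.pi * 1.5 * spacing * n     # surface wave, refractive index 1.5
amp = 0.5 * (1 + np.cos(beta - 2 * np.pi * spacing * n * np.sin(target)))
w_rhs = amp * np.exp(-1j * beta)

for name, w in [("phased array", w_phased), ("RHS", w_rhs)]:
    pattern = np.abs(w @ steer)          # radiated power pattern over angles
    print(f"{name}: beam peak at {np.rad2deg(theta[np.argmax(pattern)]):.1f} deg")
```

Both weight vectors place their main beam near the target direction, but the RHS achieves this with amplitude control alone, which is why phased-array beamforming schemes do not carry over directly.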
Abstract:As a crucial facilitator of future autonomous driving applications, wireless simultaneous localization and mapping (SLAM) has drawn growing attention recently. However, the accuracy of existing wireless SLAM schemes is limited because the antenna gain is constrained under a given cost budget due to expensive hardware components such as phased arrays. To address this issue, we propose a reconfigurable holographic surface (RHS)-aided SLAM system in this paper. The RHS is a novel type of low-cost antenna that can cut down the hardware cost by replacing the phased arrays in conventional SLAM systems. However, compared with a phased array, where the phase shifts of parallel-fed signals are adjusted, the RHS exhibits a different radiation model because its amplitude-controlled radiation elements are series-fed by surface waves, implying that traditional schemes cannot be applied directly. To address this challenge, we propose an RHS-aided beam steering method for sensing the surrounding environment and design the corresponding SLAM algorithm. Simulation results show that the proposed scheme can achieve more than three times the localization accuracy of traditional wireless SLAM at the same cost.
Abstract:Localization using holographic multiple-input multiple-output surfaces, such as the reconfigurable intelligent surface (RIS), has gained increasing attention due to its ability to accurately localize users in non-line-of-sight conditions. However, existing RIS-enabled localization methods assume that users lie in either the near-field (NF) or the far-field (FF) region, which results in high complexity or low localization accuracy, respectively, when these methods are applied over the whole area. In this paper, a unified NF and FF localization method is proposed for the RIS-enabled localization system to overcome this issue. Specifically, both the NF and FF regions are divided into grids. The RIS reflects the signals from the user to the base station~(BS), and the BS then uses the received signals to determine the grid in which the user is located. Compared with existing NF- or FF-only schemes, the design of the location estimation method and the RIS phase shift optimization algorithm is more challenging because both are based on a hybrid NF and FF model. To tackle these challenges, we formulate the optimization problems for location estimation and for the RIS phase shifts, and design two algorithms to solve the formulated problems effectively. The effectiveness of the proposed method is verified through simulations.
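A compact way to see why a hybrid model is needed: the FF model uses a planar-wavefront phase that depends only on the angle, while the NF model uses the exact spherical-wavefront distances from each element to the user. The geometry, carrier, and element count below are illustrative assumptions:

```python
import numpy as np

wavelength = 0.01                        # 30 GHz carrier (illustrative)
k = 2 * np.pi / wavelength
pos = np.stack([np.arange(64) * wavelength / 2,
                np.zeros(64)], axis=1)   # 64-element linear RIS along the x-axis

def ff_steering(theta):
    """Far field: phase is linear in element position (planar wavefront)."""
    return np.exp(-1j * k * pos[:, 0] * np.sin(theta))

def nf_steering(user_xy):
    """Near field: phase follows exact element-to-user distances (spherical wavefront)."""
    dist = np.linalg.norm(pos - user_xy, axis=1)
    return np.exp(-1j * k * dist)

user = np.array([0.15, 0.4])             # a user well inside the near-field region
theta = np.arctan2(user[0], user[1])
corr = np.abs(np.vdot(ff_steering(theta), nf_steering(user))) / 64
print(f"FF/NF steering-vector correlation for a nearby user: {corr:.2f}")  # < 1: mismatch
```

Because the two models diverge for nearby users, a grid-based estimator that covers the whole area must score candidate grids under the appropriate (NF or FF) response model.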
Abstract:Motivated by the studies of neural networks (e.g., the neural tangent kernel theory), we perform a study on the large-dimensional behavior of kernel ridge regression (KRR) where the sample size $n \asymp d^{\gamma}$ for some $\gamma > 0$. Given an RKHS $\mathcal{H}$ associated with an inner product kernel defined on the sphere $\mathbb{S}^{d}$, we suppose that the true function $f_{\rho}^{*} \in [\mathcal{H}]^{s}$, the interpolation space of $\mathcal{H}$ with source condition $s>0$. We first determine the exact order (both upper and lower bounds) of the generalization error of kernel ridge regression for the optimally chosen regularization parameter $\lambda$. We then show that KRR is minimax optimal when $0<s\le1$ and is not minimax optimal when $s>1$ (a.k.a. the saturation effect). Our results illustrate that the curves of the rate varying along $\gamma$ exhibit the periodic plateau behavior and the multiple descent behavior, and show how the curves evolve with $s>0$. Interestingly, our work provides a unified viewpoint on several recent works on kernel regression in the large-dimensional setting, which correspond to the cases $s=0$ and $s=1$, respectively.
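For concreteness, the estimator under study is standard KRR with an inner product kernel; a minimal sketch follows, where the kernel, target function, and choice of $\lambda$ are illustrative (the paper analyzes the optimally chosen $\lambda$):

```python
import numpy as np

def kernel(X, Z):
    # Inner product kernel k(x, z) = Phi(<x, z>); Phi(t) = (1 + t)^2 is illustrative.
    return (1.0 + X @ Z.T) ** 2

gamma, d = 1.5, 30
n = int(d ** gamma)                      # large-dimensional regime n ~ d^gamma
rng = np.random.default_rng(1)
X = rng.standard_normal((n, d + 1))
X /= np.linalg.norm(X, axis=1, keepdims=True)   # data on the sphere S^d
y = np.tanh(2 * X[:, 0]) + 0.1 * rng.standard_normal(n)

lam = n ** -0.5                          # illustrative regularization level
K = kernel(X, X)
alpha = np.linalg.solve(K + n * lam * np.eye(n), y)   # ridge-regularized system

def f_hat(Xnew):
    return kernel(Xnew, X) @ alpha
```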
Abstract:The widely observed 'benign overfitting phenomenon' in the neural network literature challenges the 'bias-variance trade-off' doctrine of statistical learning theory. Since the generalization ability of the 'lazy trained' over-parametrized neural network can be well approximated by that of the neural tangent kernel regression, the curve of the excess risk (namely, the learning curve) of kernel ridge regression has recently attracted increasing attention. However, most recent arguments on the learning curve are heuristic and are based on the 'Gaussian design' assumption. In this paper, under mild and more realistic assumptions, we rigorously provide a full characterization of the learning curve, elaborating the effects and the interplay of the choice of the regularization parameter, the source condition, and the noise. In particular, our results suggest that the 'benign overfitting phenomenon' exists in very wide neural networks only when the noise level is small.
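In standard notation (illustrative of, not identical to, the paper's setup), the learning curve is governed by the decomposition $$\mathbb{E}\,\big\|\hat f_\lambda - f_\rho^*\big\|_{L^2}^2 \;=\; \mathrm{bias}^2(\lambda) \;+\; \sigma^2\,\mathrm{var}(\lambda),$$ where the bias term grows with $\lambda$ while the variance term shrinks with $\lambda$; hence overfitting (taking $\lambda \to 0$) can only be benign when the noise level $\sigma^2$ multiplying the variance term is small.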
Abstract:Deep Gradient Leakage (DGL) is a highly effective attack that recovers private training images from gradient vectors. This attack poses significant privacy challenges to distributed learning from clients with sensitive data, where clients are required to share gradients. Defending against such attacks requires an understanding of when and how privacy leakage happens, which is largely missing due to the black-box nature of deep networks. In this paper, we propose a novel Inversion Influence Function (I$^2$F) that establishes a closed-form connection between the recovered images and the private gradients by implicitly solving the DGL problem. Compared to directly solving DGL, I$^2$F is scalable for analyzing deep networks, requiring only oracle access to gradients and Jacobian-vector products. We empirically demonstrate that I$^2$F effectively approximates DGL across different model architectures, datasets, attack implementations, and noise-based defenses. With this novel tool, we provide insights into effective gradient perturbation directions, the unfairness of privacy protection, and privacy-preferred model initialization. Our code is available at https://github.com/illidanlab/inversion-influence-function.
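The key computational point is that I$^2$F needs only gradients and Jacobian-vector products of the gradient map, never the explicit (huge) Jacobian. A minimal sketch of such an oracle in JAX; the model and loss are illustrative placeholders, not the authors' implementation:

```python
import jax
import jax.numpy as jnp

def loss(params, x):
    # Illustrative model: linear "network" with squared loss against a fixed label.
    return jnp.sum((x @ params - 1.0) ** 2)

def private_gradient(x, params):
    # g(x): the gradient a client would share for private data x.
    return jax.grad(loss, argnums=0)(params, x)

params = 0.1 * jnp.ones((5,))
x_private = jnp.arange(5.0).reshape(1, 5)

# Jacobian-vector product of the gradient map x -> g(x), without forming dg/dx:
v = jnp.ones_like(x_private)             # a direction of image perturbation
g, jvp_out = jax.jvp(lambda x: private_gradient(x, params), (x_private,), (v,))
print(g.shape, jvp_out.shape)            # both live in gradient space
```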
Abstract:We perform a study on kernel regression for large-dimensional data (where the sample size $n$ depends polynomially on the dimension $d$ of the samples, i.e., $n\asymp d^{\gamma}$ for some $\gamma >0$). We first build a general tool to characterize the upper bound and the minimax lower bound of kernel regression for large-dimensional data through the Mendelson complexity $\varepsilon_{n}^{2}$ and the metric entropy $\bar{\varepsilon}_{n}^{2}$, respectively. When the target function falls into the RKHS associated with a (general) inner product kernel defined on $\mathbb{S}^{d}$, we utilize the new tool to show that the minimax rate of the excess risk of kernel regression is $n^{-1/2}$ when $n\asymp d^{\gamma}$ for $\gamma =2, 4, 6, 8, \cdots$. We then further determine the optimal rate of the excess risk of kernel regression for all $\gamma>0$ and find that the curve of the optimal rate varying along $\gamma$ exhibits several new phenomena, including the {\it multiple descent behavior} and the {\it periodic plateau behavior}. As an application, we provide a similar explicit description of the optimal-rate curve for the neural tangent kernel (NTK). As a direct corollary, these claims hold for wide neural networks as well.
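Schematically, and using the abstract's notation, such a tool sandwiches the minimax excess risk between two computable quantities, $$c\,\bar{\varepsilon}_{n}^{\,2} \;\le\; \inf_{\hat f}\,\sup_{f^{*}}\,\mathbb{E}\,\mathcal{E}(\hat f) \;\le\; C\,\varepsilon_{n}^{\,2},$$ where $\varepsilon_{n}^{2}$ is the Mendelson complexity, $\bar{\varepsilon}_{n}^{2}$ is its metric-entropy counterpart, and $c, C$ are constants; this display is a schematic summary rather than the paper's exact statement.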