Abstract:We present a practical method to audit the differential privacy (DP) guarantees of a machine learning model using a small hold-out dataset that is not exposed to the model during the training. Having a score function such as the loss function employed during the training, our method estimates the total variation (TV) distance between scores obtained with a subset of the training data and the hold-out dataset. With some meta information about the underlying DP training algorithm, these TV distance values can be converted to $(\varepsilon,\delta)$-guarantees for any $\delta$. We show that these score distributions asymptotically give lower bounds for the DP guarantees of the underlying training algorithm, however, we perform a one-shot estimation for practicality reasons. We specify conditions that lead to lower bounds for the DP guarantees with high probability. To estimate the TV distance between the score distributions, we use a simple density estimation method based on histograms. We show that the TV distance gives a very close to optimally robust estimator and has an error rate $\mathcal{O}(k^{-1/3})$, where $k$ is the total number of samples. Numerical experiments on benchmark datasets illustrate the effectiveness of our approach and show improvements over baseline methods for black-box auditing.
Abstract:The large untapped spectrum in the sub-THz allows for ultra-high throughput communication to realize many seemingly impossible applications in 6G. One of the challenges in radio communications in sub-THz is the hardware impairments. Specifically, phase noise is one key hardware impairment, which is accentuated as we increase the frequency and bandwidth. Furthermore, the modest output power of the sub-THz power amplifier demands limits on peak to average power ratio (PAPR) signal design. Single carrier frequency domain equalization (SC-FDE) waveform has been identified as a suitable candidate for sub-THz, although some challenges such as phase noise and PAPR still remain to be tackled. In this work, we design a phase noise robust, low PAPR SC-FDE waveform by geometrically shaping the constellation under practical conditions. We formulate the waveform optimization problem in its augmented Lagrangian form and use a back-propagation-inspired technique to obtain a constellation design that is numerically robust to phase noise, while maintaining a low PAPR.
Abstract:Transmit antenna muting (TAM) in multiple-user multiple-input multiple-output (MU-MIMO) networks allows reducing the power consumption of the base station (BS) by properly utilizing only a subset of antennas in the BS. In this paper, we consider the downlink transmission of an MU-MIMO network where TAM is formulated to minimize the number of active antennas in the BS while guaranteeing the per-user throughput requirements. To address the computational complexity of the combinatorial optimization problem, we propose an algorithm called neural antenna muting (NAM) with an asymmetric custom loss function. NAM is a classification neural network trained in a supervised manner. The classification error in this scheme leads to either sub-optimal energy consumption or lower quality of service (QoS) for the communication link. We control the classification error probability distribution by designing an asymmetric loss function such that the erroneous classification outputs are more likely to result in fulfilling the QoS requirements. Furthermore, we present three heuristic algorithms and compare them with the NAM. Using a 3GPP compliant system-level simulator, we show that NAM achieves $\sim73\%$ energy saving compared to the full antenna configuration in the BS with $\sim95\%$ reliability in achieving the user throughput requirements while being around $1000\times$ and $24\times$ less computationally intensive than the greedy heuristic algorithm and the fixed column antenna muting algorithm, respectively.
Abstract:In wireless communications systems, the user equipment (UE) transmits a random access preamble sequence to the base station (BS) to be detected and synchronized. In standardized cellular communications systems Zadoff-Chu sequences has been proposed due to their constant amplitude zero autocorrelation (CAZAC) properties. The conventional approach is to use matched filters to detect the sequence. Sequences arrived from different antennas and time instances are summed up to reduce the noise variance. Since the knowledge of the channel is unknown at this stage, a coherent combining scheme would be very difficult to implement. In this work, we leverage the system design knowledge and propose a neural network (NN) sequence detector and timing advanced estimator. We do not replace the whole process of preamble detection by a NN. Instead, we propose to use NN only for \textit{blind} coherent combining of the signals in the detector to compensate for the channel effect, thus maximize the signal to noise ratio. We have further reduced the problem's complexity using Kronecker approximation model for channel covariance matrices, thereby, reducing the size of required NN. The analysis on timing advanced estimation and sequences detection has been performed and compared with the matched filter baseline.
Abstract:The strict latency and reliability requirements of ultra-reliable low-latency communications (URLLC) use cases are among the main drivers in fifth generation (5G) network design. Link adaptation (LA) is considered to be one of the bottlenecks to realize URLLC. In this paper, we focus on predicting the signal to interference plus noise ratio at the user to enhance the LA. Motivated by the fact that most of the URLLC use cases with most extreme latency and reliability requirements are characterized by semi-deterministic traffic, we propose to exploit the time correlation of the interference to compute useful statistics needed to predict the interference power in the next transmission. This prediction is exploited in the LA context to maximize the spectral efficiency while guaranteeing reliability at an arbitrary level. Numerical results are compared with state of the art interference prediction techniques for LA. We show that exploiting time correlation of the interference is an important enabler of URLLC.
Abstract:Blind source separation algorithms such as independent component analysis (ICA) are widely used in the analysis of neuroimaging data. In order to leverage larger sample sizes, different data holders/sites may wish to collaboratively learn feature representations. However, such datasets are often privacy-sensitive, precluding centralized analyses that pool the data at a single site. A recently proposed algorithm uses message-passing between sites and a central aggregator to perform a decentralized joint ICA (djICA) without sharing the data. However, this method does not satisfy formal privacy guarantees. We propose a differentially private algorithm for performing ICA in a decentralized data setting. Differential privacy provides a formal and mathematically rigorous privacy guarantee by introducing noise into the messages. Conventional approaches to decentralized differentially private algorithms may require too much noise due to the typically small sample sizes at each site. We leverage a recently proposed correlated noise protocol to remedy the excessive noise problem of the conventional schemes. We investigate the performance of the proposed algorithm on synthetic and real fMRI datasets to show that our algorithm outperforms existing approaches and can sometimes reach the same level of utility as the corresponding non-private algorithm. This indicates that it is possible to have meaningful utility while preserving privacy.
Abstract:Many applications of machine learning, such as human health research, involve processing private or sensitive information. Privacy concerns may impose significant hurdles to collaboration in scenarios where there are multiple sites holding data and the goal is to estimate properties jointly across all datasets. Differentially private decentralized algorithms can provide strong privacy guarantees. However, the accuracy of the joint estimates may be poor when the datasets at each site are small. This paper proposes a new framework, Correlation Assisted Private Estimation (CAPE), for designing privacy-preserving decentralized algorithms with better accuracy guarantees in an honest-but-curious model. CAPE can be used in conjunction with the functional mechanism for statistical and machine learning optimization problems. A tighter characterization of the functional mechanism is provided that allows CAPE to achieve the same performance as a centralized algorithm in the decentralized setting using all datasets. Empirical results on regression and neural network problems for both synthetic and real datasets show that differentially private methods can be competitive with non-private algorithms in many scenarios of interest.