Shitz
Abstract:Inspired by biological processes, neuromorphic computing utilizes spiking neural networks (SNNs) to perform inference tasks, offering significant efficiency gains for workloads involving sequential data. Recent advances in hardware and software have demonstrated that embedding a few bits of payload in each spike exchanged between the spiking neurons can further enhance inference accuracy. In a split computing architecture, where the SNN is divided across two separate devices, the device storing the first layers must share information about the spikes generated by the local output neurons with the other device. Consequently, the advantages of multi-level spikes must be balanced against the challenges of transmitting additional bits between the two devices. This paper addresses these challenges by investigating a wireless neuromorphic split computing architecture employing multi-level SNNs. For this system, we present the design of digital and analog modulation schemes optimized for an orthogonal frequency division multiplexing (OFDM) radio interface. Simulation and experimental results using software-defined radios provide insights into the performance gains of multi-level SNN models and the optimal payload size as a function of the quality of the connection between a transmitter and receiver.
Abstract:This work investigates a collaborative sensing and data collection system in which multiple unmanned aerial vehicles (UAVs) sense an area of interest and transmit images to a cloud server (CS) for processing. To accelerate the completion of sensing missions, including data transmission, the sensing task is divided into individual private sensing tasks for each UAV and a common sensing task that is executed by all UAVs to enable cooperative transmission. Unlike existing studies, we explore the use of an advanced cell-free multiple-input multiple-output (MIMO) network, which effectively manages inter-UAV interference. To further optimize wireless channel utilization, we propose a hybrid transmission strategy that combines time-division multiple access (TDMA), non-orthogonal multiple access (NOMA), and cooperative transmission. The problem of jointly optimizing task splitting ratios and the hybrid TDMA-NOMA-cooperative transmission strategy is formulated with the objective of minimizing mission completion time. Extensive numerical results demonstrate the effectiveness of the proposed task allocation and hybrid transmission scheme in accelerating the completion of sensing missions.
Abstract:Sequence models have demonstrated the ability to perform tasks like channel equalization and symbol detection by automatically adapting to current channel conditions. This is done without requiring any explicit optimization and by leveraging not only short pilot sequences but also contextual information such as long-term channel statistics. The operating principle underlying automatic adaptation is in-context learning (ICL), an emerging property of sequence models. Prior art adopted transformer-based sequence models, which, however, have a computational complexity scaling quadratically with the context length due to batch processing. Recently, state-space models (SSMs) have emerged as a more efficient alternative, affording a linear inference complexity in the context size. This work explores the potential of SSMs for ICL-based equalization in cell-free massive MIMO systems. Results show that selective SSMs achieve comparable performance to transformer-based models while requiring approximately eight times fewer parameters and five times fewer floating-point operations.
Abstract:To support real-world decision-making, it is crucial for models to be well-calibrated, i.e., to assign reliable confidence estimates to their predictions. Uncertainty quantification is particularly important in personalized federated learning (PFL), as participating clients typically have small local datasets, making it difficult to unambiguously determine optimal model parameters. Bayesian PFL (BPFL) methods can potentially enhance calibration, but they often come with considerable computational and memory requirements due to the need to track the variances of all the individual model parameters. Furthermore, different clients may exhibit heterogeneous uncertainty levels owing to varying local dataset sizes and distributions. To address these challenges, we propose LR-BPFL, a novel BPFL method that learns a global deterministic model along with personalized low-rank Bayesian corrections. To tailor the local model to each client's inherent uncertainty level, LR-BPFL incorporates an adaptive rank selection mechanism. We evaluate LR-BPFL across a variety of datasets, demonstrating its advantages in terms of calibration, accuracy, as well as computational and memory requirements.
Abstract:We introduce adaptive learn-then-test (aLTT), an efficient hyperparameter selection procedure that provides finite-sample statistical guarantees on the population risk of AI models. Unlike the existing learn-then-test (LTT) technique, which relies on conventional p-value-based multiple hypothesis testing (MHT), aLTT implements sequential data-dependent MHT with early termination by leveraging e-processes. As a result, aLTT can reduce the number of testing rounds, making it particularly well-suited for scenarios in which testing is costly or presents safety risks. Apart from maintaining statistical validity, in applications such as online policy selection for offline reinforcement learning and hyperparameter tuning for engineering systems, aLTT is shown to achieve the same performance as LTT while requiring only a fraction of the testing rounds.
Abstract:This paper presents communication-constrained distributed conformal risk control (CD-CRC) framework, a novel decision-making framework for sensor networks under communication constraints. Targeting multi-label classification problems, such as segmentation, CD-CRC dynamically adjusts local and global thresholds used to identify significant labels with the goal of ensuring a target false negative rate (FNR), while adhering to communication capacity limits. CD-CRC builds on online exponentiated gradient descent to estimate the relative quality of the observations of different sensors, and on online conformal risk control (CRC) as a mechanism to control local and global thresholds. CD-CRC is proved to offer deterministic worst-case performance guarantees in terms of FNR and communication overhead, while the regret performance in terms of false positive rate (FPR) is characterized as a function of the key hyperparameters. Simulation results highlight the effectiveness of CD-CRC, particularly in communication resource-constrained environments, making it a valuable tool for enhancing the performance and reliability of distributed sensor networks.
Abstract:The information bottleneck (IB) problem is a widely studied framework in machine learning for extracting compressed features that are informative for downstream tasks. However, current approaches to solving the IB problem rely on a heuristic tuning of hyperparameters, offering no guarantees that the learned features satisfy information-theoretic constraints. In this work, we introduce a statistically valid solution to this problem, referred to as IB via multiple hypothesis testing (IB-MHT), which ensures that the learned features meet the IB constraints with high probability, regardless of the size of the available dataset. The proposed methodology builds on Pareto testing and learn-then-test (LTT), and it wraps around existing IB solvers to provide statistical guarantees on the IB constraints. We demonstrate the performance of IB-MHT on classical and deterministic IB formulations, validating the effectiveness of IB-MHT in outperforming conventional methods in terms of statistical robustness and reliability.
Abstract:The increasing adoption of Artificial Intelligence (AI) in engineering problems calls for the development of calibration methods capable of offering robust statistical reliability guarantees. The calibration of black box AI models is carried out via the optimization of hyperparameters dictating architecture, optimization, and/or inference configuration. Prior work has introduced learn-then-test (LTT), a calibration procedure for hyperparameter optimization (HPO) that provides statistical guarantees on average performance measures. Recognizing the importance of controlling risk-aware objectives in engineering contexts, this work introduces a variant of LTT that is designed to provide statistical guarantees on quantiles of a risk measure. We illustrate the practical advantages of this approach by applying the proposed algorithm to a radio access scheduling problem.
Abstract:In modern wireless network architectures, such as O-RAN, artificial intelligence (AI)-based applications are deployed at intelligent controllers to carry out functionalities like scheduling or power control. The AI "apps" are selected on the basis of contextual information such as network conditions, topology, traffic statistics, and design goals. The mapping between context and AI model parameters is ideally done in a zero-shot fashion via an automatic model selection (AMS) mapping that leverages only contextual information without requiring any current data. This paper introduces a general methodology for the online optimization of AMS mappings. Optimizing an AMS mapping is challenging, as it requires exposure to data collected from many different contexts. Therefore, if carried out online, this initial optimization phase would be extremely time consuming. A possible solution is to leverage a digital twin of the physical system to generate synthetic data from multiple simulated contexts. However, given that the simulator at the digital twin is imperfect, a direct use of simulated data for the optimization of the AMS mapping would yield poor performance when tested in the real system. This paper proposes a novel method for the online optimization of AMS mapping that corrects for the bias of the simulator by means of limited real data collected from the physical system. Experimental results for a graph neural network-based power control app demonstrate the significant advantages of the proposed approach.
Abstract:For modern artificial intelligence (AI) applications such as large language models (LLMs), the training paradigm has recently shifted to pre-training followed by fine-tuning. Furthermore, owing to dwindling open repositories of data and thanks to efforts to democratize access to AI models, pre-training is expected to increasingly migrate from the current centralized deployments to federated learning (FL) implementations. Meta-learning provides a general framework in which pre-training and fine-tuning can be formalized. Meta-learning-based personalized FL (meta-pFL) moves beyond basic personalization by targeting generalization to new agents and tasks. This paper studies the generalization performance of meta-pFL for a wireless setting in which the agents participating in the pre-training phase, i.e., meta-learning, are connected via a shared wireless channel to the server. Adopting over-the-air computing, we study the trade-off between generalization to new agents and tasks, on the one hand, and convergence, on the other hand. The trade-off arises from the fact that channel impairments may enhance generalization, while degrading convergence. Extensive numerical results validate the theory.