Abstract:Blind estimation of intersymbol interference channels based on the Baum-Welch (BW) algorithm, a specific implementation of the expectation-maximization (EM) algorithm for training hidden Markov models, is robust and does not require labeled data. However, it is known for its extensive computation cost, slow convergence, and frequently converges to a local maximum. In this paper, we modified the trellis structure of the BW algorithm by associating the channel parameters with two consecutive states. This modification enables us to reduce the number of required states by half while maintaining the same performance. Moreover, to improve the convergence rate and the estimation performance, we construct a joint turbo-BW-equalization system by exploiting the extrinsic information produced by the turbo decoder to refine the BW-based estimator at each EM iteration. Our experiments demonstrate that the joint system achieves convergence in just 4 EM iterations, which is 8 iterations less than a separate system design for a signal-to-noise ratio (SNR) of 6 dB. Additionally, the joint system provides improved estimation accuracy with a mean square error (MSE) of $10^{-4}$. We also identify scenarios where a joint design is not preferable, especially when the channel is noisy (e.g., SNR=2 dB) and the turbo decoder is unable to provide reliable extrinsic information for a BW-based estimator.
Abstract:Adaptive gating plays a key role in temporal data processing via classical recurrent neural networks (RNN), as it facilitates retention of past information necessary to predict the future, providing a mechanism that preserves invariance to time warping transformations. This paper builds on quantum recurrent neural networks (QRNNs), a dynamic model with quantum memory, to introduce a novel class of temporal data processing quantum models that preserve invariance to time-warping transformations of the (classical) input-output sequences. The model, referred to as time warping-invariant QRNN (TWI-QRNN), augments a QRNN with a quantum-classical adaptive gating mechanism that chooses whether to apply a parameterized unitary transformation at each time step as a function of the past samples of the input sequence via a classical recurrent model. The TWI-QRNN model class is derived from first principles, and its capacity to successfully implement time-warping transformations is experimentally demonstrated on examples with classical or quantum dynamics.
Abstract:Deep learning has achieved remarkable success in many machine learning tasks such as image classification, speech recognition, and game playing. However, these breakthroughs are often difficult to translate into real-world engineering systems because deep learning models require a massive number of training samples, which are costly to obtain in practice. To address labeled data scarcity, few-shot meta-learning optimizes learning algorithms that can efficiently adapt to new tasks quickly. While meta-learning is gaining significant interest in the machine learning literature, its working principles and theoretic fundamentals are not as well understood in the engineering community. This review monograph provides an introduction to meta-learning by covering principles, algorithms, theory, and engineering applications. After introducing meta-learning in comparison with conventional and joint learning, we describe the main meta-learning algorithms, as well as a general bilevel optimization framework for the definition of meta-learning techniques. Then, we summarize known results on the generalization capabilities of meta-learning from a statistical learning viewpoint. Applications to communication systems, including decoding and power allocation, are discussed next, followed by an introduction to aspects related to the integration of meta-learning with emerging computing technologies, namely neuromorphic and quantum computing. The monograph is concluded with an overview of open research challenges.
Abstract:Near-term noisy intermediate-scale quantum circuits can efficiently implement implicit probabilistic models in discrete spaces, supporting distributions that are practically infeasible to sample from using classical means. One of the possible applications of such models, also known as Born machines, is probabilistic inference, which is at the core of Bayesian methods. This paper studies the use of Born machines for the problem of training binary Bayesian neural networks. In the proposed approach, a Born machine is used to model the variational distribution of the binary weights of the neural network, and data from multiple tasks is used to reduce training data requirements on new tasks. The method combines gradient-based meta-learning and variational inference via Born machines, and is shown in a prototypical regression problem to outperform conventional joint learning strategies.
Abstract:Quantum machine learning has emerged as a potential practical application of near-term quantum devices. In this work, we study a two-layer hybrid classical-quantum classifier in which a first layer of quantum stochastic neurons implementing generalized linear models (QGLMs) is followed by a second classical combining layer. The input to the first, hidden, layer is obtained via amplitude encoding in order to leverage the exponential size of the fan-in of the quantum neurons in the number of qubits per neuron. To facilitate implementation of the QGLMs, all weights and activations are binary. While the state of the art on training strategies for this class of models is limited to exhaustive search and single-neuron perceptron-like bit-flip strategies, this letter introduces a stochastic variational optimization approach that enables the joint training of quantum and classical layers via stochastic gradient descent. Experiments show the advantages of the approach for a variety of activation functions implemented by QGLM neurons.
Abstract:Data-efficient learning algorithms are essential in many practical applications for which data collection and labeling is expensive or infeasible, e.g., for autonomous cars. To address this problem, meta-learning infers an inductive bias from a set of meta-training tasks in order to learn new, but related, task using a small number of samples. Most studies assume the meta-learner to have access to labeled data sets from a large number of tasks. In practice, one may have available only unlabeled data sets from the tasks, requiring a costly labeling procedure to be carried out before use in standard meta-learning schemes. To decrease the number of labeling requests for meta-training tasks, this paper introduces an information-theoretic active task selection mechanism which quantifies the epistemic uncertainty via disagreements among the predictions obtained under different inductive biases. We detail an instantiation for nonparametric methods based on Gaussian Process Regression, and report its empirical performance results that compare favourably against existing heuristic acquisition mechanisms.
Abstract:Power control in decentralized wireless networks poses a complex stochastic optimization problem when formulated as the maximization of the average sum rate for arbitrary interference graphs. Recent work has introduced data-driven design methods that leverage graph neural network (GNN) to efficiently parametrize the power control policy mapping channel state information (CSI) to the power vector. The specific GNN architecture, known as random edge GNN (REGNN), defines a non-linear graph convolutional architecture whose spatial weights are tied to the channel coefficients, enabling a direct adaption to channel conditions. This paper studies the higher-level problem of enabling fast adaption of the power control policy to time-varying topologies. To this end, we apply first-order meta-learning on data from multiple topologies with the aim of optimizing for a few-shot adaptation to new network configurations.