Abstract:Health registers contain rich information about individuals' health histories. Here our interest lies in understanding how individuals' health trajectories evolve in a nationwide longitudinal dataset with coded features, such as clinical codes, procedures, and drug purchases. We introduce a straightforward approach for training a Transformer-based deep learning model in a way that lets us analyze how individuals' trajectories change over time. This is achieved by modifying the training objective and by applying a causal attention mask. We focus here on the general task of predicting the onset of a range of common diseases in a given future forecast interval. However, instead of providing a single prediction about diagnoses that could occur in this forecast interval, our approach enables the model to provide continuous predictions at every time point up until, and conditioned on, the time of the forecast period. We find that this model performs comparably to other models, including a bidirectional Transformer model, in terms of basic prediction performance while at the same time offering promising trajectory modeling properties. We explore a couple of ways to use this model for analyzing health trajectories and aiding in early detection of events that forecast possible later disease onsets. We hypothesize that this method may be helpful in continuous monitoring of people's health trajectories and enabling interventions in ongoing health trajectories, as well as being useful in retrospective analyses.
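The causal attention mask mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `causal_mask` and `masked_softmax` are hypothetical, and the example uses plain Python lists in place of a real Transformer. It only shows the general idea that position i may attend to positions j <= i, so a prediction at any time point depends only on earlier events.

```python
import math

def causal_mask(n):
    """Return an n x n boolean mask; mask[i][j] is True when position i
    may attend to position j, i.e. when j <= i (no look-ahead)."""
    return [[j <= i for j in range(n)] for i in range(n)]

def masked_softmax(scores, mask):
    """Row-wise softmax over attention scores. Positions where the mask
    is False receive exactly zero weight."""
    out = []
    for row, allowed in zip(scores, mask):
        # Subtract the largest allowed score for numerical stability.
        m = max(s for s, a in zip(row, allowed) if a)
        exps = [math.exp(s - m) if a else 0.0 for s, a in zip(row, allowed)]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

# Hypothetical example: four events with identical raw attention scores.
scores = [[0.0] * 4 for _ in range(4)]
weights = masked_softmax(scores, causal_mask(4))
# weights[0] -> [1.0, 0.0, 0.0, 0.0]  (the first event sees only itself)
# weights[3] -> [0.25, 0.25, 0.25, 0.25]  (the last event sees all four)
```

Because every row ignores later positions, one forward pass yields a prediction at each time point, which is the property the abstract relies on for continuous predictions.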
Abstract:Bayesian deep learning offers a principled approach to train neural networks that accounts for both aleatoric and epistemic uncertainty. In variational inference, priors are often specified over the weight parameters, but they do not capture the true prior knowledge in large and complex neural network architectures. We present a simple approach to incorporate summary information about the predicted probability (such as sigmoid or softmax score) outputs in Bayesian neural networks (BNNs). The available summary information is incorporated as augmented data and modeled with a Dirichlet process, and we derive the corresponding \emph{Summary Evidence Lower BOund}. We show how the method can inform the model about task difficulty or class imbalance. Extensive empirical experiments show that, with negligible computational overhead, the proposed method yields a BNN with a better calibration of uncertainty.
Abstract:Compact neural networks are essential for affordable and power-efficient deep learning solutions. Binary Neural Networks (BNNs) take compactification to the extreme by constraining both weights and activations to two levels, $\{+1, -1\}$. However, training BNNs is not easy due to the discontinuity in activation functions, and the training dynamics of BNNs are not well understood. In this paper, we present an information-theoretic perspective of BNN training. We analyze BNNs through the Information Bottleneck principle and observe that the training dynamics of BNNs are considerably different from those of Deep Neural Networks (DNNs). While DNNs have separate empirical risk minimization and representation compression phases, our numerical experiments show that in BNNs, both these phases are simultaneous. Since BNNs have less expressive capacity, they tend to find efficient hidden representations concurrently with label fitting. Experiments on multiple datasets support these observations, and we see consistent behavior across different activation functions in BNNs.
Abstract:Directional beamforming is a crucial component for realizing robust wireless communication systems using millimeter wave (mmWave) technology. Beam alignment using a brute-force search of the space introduces time overhead, while location-aided blind beam alignment adds hardware requirements to the system. In this paper, we introduce a method for blind alignment based on the RF fingerprints of user equipment obtained by the base stations. The proposed system performs blind beamforming in a cellular environment with multiple base stations and multiple mobile users using deep reinforcement learning. We present a novel neural network architecture that can handle a mix of both continuous and discrete actions and use policy gradient methods to train the model. Our results show that the proposed model is able to give a considerable improvement in data rates over traditional methods.
Abstract:Cell Free Massive MIMO is a solution for improving the spectral efficiency of next-generation communication systems, and a crucial aspect of realizing the gains of the technology is the availability of accurate Channel State Information (CSI). Time Division Duplexing (TDD) mode is popular for Cell Free Massive MIMO since the physical wireless channel's assumed reciprocity facilitates channel estimation. However, the availability of accurate CSI in the TDD mode is hindered by the non-reciprocity of the end-to-end channel, due to the presence of RF components, as well as the unavailability of CSI in the subcarriers that do not have reference signals. Hence, the prediction of the downlink CSI in the subcarriers without reference signals becomes an even more complicated problem. In this work, we consider TDD non-reciprocity with limited availability of resource elements for CSI estimation and propose a deep-learning-based approach using cascaded Deep Neural Networks (DNNs) to attain a one-shot prediction of the reverse channel across the entire bandwidth. The proposed method is able to estimate downlink CSI at all subcarriers from the uplink CSI at selected subcarriers and hence does not require downlink CSI feedback.
Abstract:An approach to designing end-to-end communication systems using deep learning, leveraging the generative modeling capabilities of autoencoders, is presented. The system models are designed using Deep Neural Networks (DNNs), and the objective function for optimizing these models is derived using variational inference. Through experimental validation, the proposed method is shown to consistently produce better models in terms of error rate performance as well as constellation packing density compared to previous works.
Abstract:Given the increasing number of devices that will be connected to wireless networks with the advent of the Internet of Things (IoT), spectrum scarcity will present a major challenge. Application of opportunistic spectrum access mechanisms to IoT networks will become increasingly important to address this challenge. In this paper, we present a cognitive radio network architecture which uses multi-stage online learning techniques for spectrum assignment to devices, with the aim of improving the throughput and energy efficiency of the IoT devices. In the first stage, we use an AI technique to learn the quality of a user-channel pairing. The next stage utilizes a non-parametric Bayesian learning algorithm to estimate the Primary User OFF time in each channel. The third stage augments the Bayesian learner with implicit exploration to accelerate the learning procedure. The proposed method leads to significant improvement in throughput and energy efficiency of the IoT devices while keeping the interference to the primary users minimal. We provide comprehensive empirical validation of the method against other learning-based approaches.
Abstract:We study how to adapt a predictor to a non-stationary environment with advice from multiple experts. We study the problem under complete feedback, from a decision-theoretic point of view, when the best expert changes over time. The proposed algorithm is based on the popular exponential weighting method with exponential discounting. We provide theoretical results bounding regret under the exponential discounting setting. An upper bound on regret is derived for the finite time horizon problem. Numerical verification on different real-life datasets is provided to show the utility of the proposed algorithm.
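The combination of exponential weighting and exponential discounting described in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: the abstract does not give the exact update rule, so the sketch assumes squared loss, a discounted cumulative loss of the form L <- gamma * L + loss, and a weighted-average prediction. The function name `discounted_exp_weights` and the parameters `eta` and `gamma` are hypothetical.

```python
import math

def discounted_exp_weights(expert_preds, outcomes, eta=2.0, gamma=0.9):
    """Exponentially weighted forecaster with discounted losses.

    expert_preds: list of rounds, each a list of per-expert predictions.
    outcomes:     list of observed outcomes, one per round.
    eta:          learning rate (assumed value).
    gamma:        discount factor in (0, 1]; smaller forgets faster.
    Returns the forecaster's predictions and the final discounted losses.
    """
    n = len(expert_preds[0])
    cum_loss = [0.0] * n  # discounted cumulative loss per expert
    predictions = []
    for preds, y in zip(expert_preds, outcomes):
        # Weights are exp(-eta * loss); shift by the minimum for stability.
        m = min(cum_loss)
        w = [math.exp(-eta * (loss - m)) for loss in cum_loss]
        total = sum(w)
        predictions.append(sum(wi * pi for wi, pi in zip(w, preds)) / total)
        # Full feedback: every expert's loss is observed each round.
        # Old losses are discounted before the new squared loss is added.
        cum_loss = [gamma * loss + (pi - y) ** 2
                    for loss, pi in zip(cum_loss, preds)]
    return predictions, cum_loss

# Hypothetical example: expert 0 always predicts 0, expert 1 always predicts 1.
# The best expert changes halfway through.
rounds = [[0.0, 1.0]] * 40
outcomes = [0.0] * 20 + [1.0] * 20
predictions, final_loss = discounted_exp_weights(rounds, outcomes)
```

With discounting, the losses that expert 1 accumulated in the first half decay geometrically, so the forecaster can shift its weight to expert 1 after the change rather than being anchored to the full history.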
Abstract:With the advent of the 5th generation of wireless standards and an increasing demand for higher throughput, methods to improve the spectral efficiency of wireless systems have become very important. In the context of cognitive radio, a substantial increase in throughput is possible if the secondary user can make smart decisions regarding which channel to sense and when or how often to sense. Here, we propose an algorithm to not only select a channel for data transmission but also to predict how long the channel will remain unoccupied so that the time spent on channel sensing can be minimized. Our algorithm learns in two stages: a reinforcement learning approach for channel selection and a Bayesian approach to determine the optimal duration for which sensing can be skipped. Comparisons with other learning methods are provided through extensive simulations. We show that the number of sensing operations is minimized with a negligible increase in primary interference; this implies that less energy is spent by the secondary user in sensing and also that higher throughput is achieved by saving on sensing.
Abstract:We consider the multi-armed bandit problem in non-stationary environments. Based on the Bayesian method, we propose a variant of Thompson Sampling which can be used in both rested and restless bandit scenarios. Applying discounting to the parameters of the prior distribution, we describe a way to systematically reduce the effect of past observations. Further, we derive the exact expression for the probability of picking sub-optimal arms. By increasing the exploitative value of Bayes' samples, we also provide an optimistic version of the algorithm. Extensive empirical analysis is conducted under various scenarios to validate the utility of the proposed algorithms. A comparison study with various state-of-the-art algorithms is also included.
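The idea of discounting the prior parameters in Thompson Sampling, as described in the abstract above, can be sketched as follows. This is a minimal illustration under assumptions the abstract does not state: Bernoulli rewards, Beta priors, and a single discount factor applied to every arm's success and failure counts at each step. The function name `discounted_thompson_sampling` and the parameters `prob_at`, `gamma`, and `prior` are hypothetical, and the optimistic variant is not shown.

```python
import random

def discounted_thompson_sampling(prob_at, horizon, gamma=0.95, prior=1.0, seed=0):
    """Thompson Sampling with discounted Beta parameters (Bernoulli rewards).

    prob_at: function mapping a time step t to the list of per-arm
             reward probabilities at that step (may change over time).
    gamma:   discount factor in (0, 1]; gamma = 1 gives standard
             Thompson Sampling.
    prior:   Beta prior pseudo-count added to both parameters.
    Returns the chosen arms and the final discounted counts.
    """
    rng = random.Random(seed)
    n_arms = len(prob_at(0))
    successes = [0.0] * n_arms
    failures = [0.0] * n_arms
    choices = []
    for t in range(horizon):
        # Draw one sample per arm from its current Beta posterior.
        samples = [rng.betavariate(successes[a] + prior, failures[a] + prior)
                   for a in range(n_arms)]
        arm = max(range(n_arms), key=samples.__getitem__)
        reward = 1 if rng.random() < prob_at(t)[arm] else 0
        # Discount all arms so that old observations fade away,
        # then add the new observation to the arm that was played.
        successes = [gamma * s for s in successes]
        failures = [gamma * f for f in failures]
        successes[arm] += reward
        failures[arm] += 1 - reward
        choices.append(arm)
    return choices, successes, failures

# Hypothetical non-stationary example: the better arm switches at t = 200.
def prob_at(t):
    return [0.8, 0.2] if t < 200 else [0.2, 0.8]

choices, successes, failures = discounted_thompson_sampling(prob_at, horizon=400)
```

Because the counts are multiplied by `gamma` every step, their total is bounded by 1 / (1 - gamma), which keeps the posteriors wide enough for the sampler to notice when the best arm changes.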