Abstract: We study a real-time tracking problem in an energy harvesting system with a Markov source under an imperfect channel. We consider both sampling and transmission costs; unlike most prior studies that assume the source is fully observable, the sampling cost renders the source unobservable. The goal is to jointly optimize sampling and transmission policies for three semantic-aware metrics: i) the age of information (AoI), ii) general distortion, and iii) the age of incorrect information (AoII). To this end, we formulate and solve a stochastic control problem. Specifically, for the AoI metric, we cast a Markov decision process (MDP) problem and solve it using relative value iteration (RVI). For the distortion and AoII metrics, we use partially observable MDP (POMDP) modeling and leverage the belief-MDP formulation of the POMDP to find optimal policies. For the distortion metric, and for the AoII metric under the perfect-channel setup, we effectively truncate the corresponding belief space and solve an MDP problem using RVI. For the general setup, a deep reinforcement learning policy is proposed. Through simulations, we demonstrate significant performance improvements achieved by the derived policies. The results reveal various switching-type structures of the optimal policies and show that a distortion-optimal policy is also AoII optimal.
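As a rough illustration of the relative value iteration step mentioned above, the following Python sketch solves a generic average-cost MDP; the transition kernel P, the cost matrix C, and the toy numbers are placeholders for illustration, not the paper's model.

```python
import numpy as np

def relative_value_iteration(P, C, tol=1e-9, max_iter=10000):
    """RVI for an average-cost MDP.
    P: (A, S, S) transition kernel, P[a, s, s'] = Pr(s' | s, a).
    C: (S, A) per-slot cost of taking action a in state s."""
    S, A = C.shape
    h = np.zeros(S)        # relative value function
    ref = 0                # arbitrary reference state
    g = 0.0
    for _ in range(max_iter):
        # Bellman backup: Q[s, a] = C[s, a] + sum_{s'} P[a, s, s'] h[s']
        Q = C + np.einsum('asx,x->sa', P, h)
        h_new = Q.min(axis=1)
        g = h_new[ref]     # estimate of the optimal average cost (gain)
        h_new = h_new - g  # keep values relative to the reference state
        if np.max(np.abs(h_new - h)) < tol:
            h = h_new
            break
        h = h_new
    return g, Q.argmin(axis=1)

# Toy 2-state, 2-action example (numbers are assumptions, not from the paper).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],   # action 0
              [[0.5, 0.5], [0.5, 0.5]]])  # action 1
C = np.array([[0.0, 1.0],                 # cost in state 0 for actions 0, 1
              [2.0, 1.0]])                # cost in state 1 for actions 0, 1
gain, policy = relative_value_iteration(P, C)
```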
Abstract: We study a real-time tracking problem in an energy harvesting status update system with a Markov source under both sampling and transmission costs. The problem's primary challenge stems from the non-observability of the source due to the sampling cost. Using the age of incorrect information (AoII) as a semantic-aware performance metric, our main goal is to find an optimal policy that minimizes the time-average AoII subject to an energy-causality constraint. To this end, a stochastic optimization problem is formulated and solved by modeling it as a partially observable Markov decision process (POMDP). More specifically, we use the notion of a belief state and, by characterizing the belief space, cast the main problem as an MDP whose cost function is a non-linear function of the age of information (AoI), which we solve via relative value iteration. Simulation results show the effectiveness of the derived policy, which exhibits a double-threshold structure in the battery level and the AoI.
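To see how the belief-MDP cost can become a non-linear function of the AoI, the sketch below assumes a symmetric two-state Markov source with per-slot flip probability p; this closed form is standard for such a chain and is only an illustrative special case, not the paper's general model.

```python
def change_probability(aoi, p):
    """Probability that a symmetric two-state Markov source (per-slot flip
    probability p) differs from the last delivered sample taken `aoi` slots
    ago: the belief grows with the AoI and saturates at 1/2."""
    return 0.5 * (1.0 - (1.0 - 2.0 * p) ** aoi)

# Example: error belief as a function of the AoI for p = 0.1 (assumed value).
print([round(change_probability(d, 0.1), 3) for d in range(1, 6)])
```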
Abstract: This letter provides query-age-optimal joint sampling and transmission scheduling policies for a heterogeneous status update system, consisting of a stochastic-arrival source and a generate-at-will source, with an unreliable channel. Our main goal is to minimize the average query age of information (QAoI) subject to average sampling, average transmission, and per-slot transmission constraints. To this end, an optimization problem is formulated and solved by casting it into a linear program. We also provide a low-complexity near-optimal policy using the notion of weakly coupled constrained Markov decision processes. The numerical results show up to 32% performance improvement by the proposed policies compared with a benchmark policy.
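One standard way to cast an average-cost constrained MDP as a linear program is over state-action occupation measures; the sketch below is a generic illustration of that casting using SciPy, where P, c, d, and D are hypothetical stand-ins for the QAoI model's transition kernel, objective cost, constraint costs, and budgets.

```python
import numpy as np
from scipy.optimize import linprog

def solve_cmdp_lp(P, c, d, D):
    """Occupation-measure LP for an average-cost CMDP (generic sketch).
    P: (S, A, S) kernel, c: (S, A) objective cost,
    d: (K, S, A) constraint costs, D: (K,) budgets.
    Returns the optimal occupation measure x[s, a]."""
    S, A, _ = P.shape
    n = S * A
    obj = c.reshape(n)                          # minimize sum x[s,a] c[s,a]
    A_eq = np.zeros((S + 1, n))
    for sp in range(S):
        A_eq[sp] = -P[:, :, sp].reshape(n)      # inflow into state sp
        A_eq[sp, sp * A:(sp + 1) * A] += 1.0    # outflow from state sp
    A_eq[S, :] = 1.0                            # occupation measure sums to 1
    b_eq = np.zeros(S + 1)
    b_eq[S] = 1.0
    A_ub = d.reshape(len(D), n)                 # average resource constraints
    res = linprog(obj, A_ub=A_ub, b_ub=D, A_eq=A_eq, b_eq=b_eq,
                  bounds=[(0, None)] * n, method='highs')
    return res.x.reshape(S, A)
```

A (possibly randomized) stationary policy then follows by normalizing x[s, a] over the actions of each visited state.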
Abstract: We consider a multi-source relaying system where independent sources randomly generate status update packets that are sent to the destination with the aid of a relay through unreliable links. We develop scheduling policies to minimize the sum average age of information (AoI) subject to transmission capacity and long-run average resource constraints. We formulate a stochastic optimization problem and solve it under two different scenarios regarding the knowledge of system statistics: a known environment and an unknown environment. For the known environment, a constrained Markov decision process (CMDP) approach and a drift-plus-penalty method are proposed. The CMDP problem is solved by transforming it into an MDP problem using the Lagrangian relaxation method. We theoretically analyze the structure of optimal policies for the MDP problem and subsequently propose a structure-aware algorithm that returns a practical near-optimal policy. Using the drift-plus-penalty method, we devise a dynamic, near-optimal, low-complexity policy. For the unknown environment, we develop a deep reinforcement learning policy by employing Lyapunov optimization theory and a dueling double deep Q-network. Simulation results are provided to assess the performance of our policies and validate the theoretical results. The results show up to 91% performance improvement compared to a baseline policy.
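The per-slot decision rule of a drift-plus-penalty scheduler can be sketched as below; the candidate actions, AoI penalties, resource usages, and budget are placeholders rather than the paper's exact formulation.

```python
import numpy as np

def drift_plus_penalty_step(Q, V, penalties, resources, budget):
    """One slot of a drift-plus-penalty scheduler (generic sketch).
    Q: virtual-queue backlog of the long-run average resource constraint.
    V: weight trading off the AoI penalty against constraint satisfaction.
    penalties[a], resources[a]: per-slot AoI penalty and resource usage of
    candidate action a; budget: allowed average resource usage per slot."""
    scores = (V * np.asarray(penalties, dtype=float)
              + Q * np.asarray(resources, dtype=float))
    a = int(np.argmin(scores))                    # minimize drift plus penalty
    Q_next = max(Q + resources[a] - budget, 0.0)  # virtual-queue update
    return a, Q_next
```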
Abstract: We consider a multi-source relaying system where the sources independently and randomly generate status update packets that are sent to the destination with the aid of a buffer-aided relay through unreliable links. We formulate a stochastic optimization problem aiming to minimize the sum average age of information (AAoI) of the sources under per-slot transmission capacity constraints and a long-run average resource constraint. To solve the problem, we recast it as a constrained Markov decision process (CMDP) problem and adopt the Lagrangian method. We analyze the structure of an optimal policy for the resulting MDP problem and show that it possesses a switching-type structure. We propose an algorithm that obtains a stationary deterministic near-optimal policy, establishing a benchmark for the system. Simulation results show the effectiveness of our algorithm compared to benchmark algorithms.
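The Lagrangian step can be illustrated by a bisection search over the multiplier, assuming hypothetical helpers solve_mdp(lam) (an unconstrained MDP solver for the combined cost c + lam*d) and usage_of(policy) (the long-run average resource usage of a policy); a true CMDP optimum may randomize between two deterministic policies at the critical multiplier, which this sketch ignores.

```python
def tune_multiplier(solve_mdp, usage_of, budget, lam_max=100.0, iters=40):
    """Bisection on the Lagrange multiplier of a CMDP (generic sketch).
    Returns a feasible stationary deterministic policy and its multiplier."""
    lo, hi = 0.0, lam_max
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        policy = solve_mdp(mid)
        if usage_of(policy) > budget:
            lo = mid   # constraint violated: penalize resource usage more
        else:
            hi = mid   # constraint satisfied: try a smaller multiplier
    return solve_mdp(hi), hi
```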
Abstract: This paper studies a novel approach for successive interference cancellation (SIC) ordering and beamforming in a multi-antenna non-orthogonal multiple access (NOMA) network with a multi-carrier, multi-user setup. To this end, we formulate a joint beamforming design, subcarrier allocation, user association, and SIC ordering problem to maximize the worst-case energy efficiency (EE). The formulated problem is a non-convex mixed-integer non-linear program (MINLP), which is generally difficult to solve. To handle it, we first adopt a linearization technique and relax the integer variables, and then employ the Dinkelbach algorithm to convert the problem into a more mathematically tractable form. The resulting non-convex optimization problem is transformed into an equivalent rank-constrained semidefinite program (SDP) and is solved via SDP relaxation and sequential fractional programming. Furthermore, to strike a balance between complexity and performance, a low-complexity approach based on alternating optimization is adopted. Numerical results show that the proposed SIC ordering method outperforms conventional schemes from the literature.
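The Dinkelbach iteration used to handle the fractional energy-efficiency objective can be sketched generically as below; solve_subproblem stands in for the inner parametric solver (e.g., the SDP-relaxed beamforming problem), and f and g are the numerator and denominator of the EE ratio. This is an assumed illustration, not the paper's exact algorithm.

```python
def dinkelbach(solve_subproblem, f, g, lam0=0.0, tol=1e-6, max_iter=50):
    """Dinkelbach iteration for maximizing a ratio f(x)/g(x) with g(x) > 0.
    solve_subproblem(lam) -> x maximizing f(x) - lam * g(x)."""
    lam, x = lam0, None
    for _ in range(max_iter):
        x = solve_subproblem(lam)
        if abs(f(x) - lam * g(x)) < tol:  # parametric optimality condition
            break
        lam = f(x) / g(x)                 # update the EE (ratio) estimate
    return x, lam
```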