Matthew Andrews

Learning-Based Multiuser Scheduling in MIMO-OFDM Systems with Hybrid Beamforming

Jun 09, 2025

Pouya Agheli, Tugce Kobal, François Durand, Matthew Andrews

Abstract: We investigate the multiuser scheduling problem in multiple-input multiple-output (MIMO) systems using orthogonal frequency division multiplexing (OFDM) and hybrid beamforming, in which a base station (BS) communicates with multiple users over millimeter wave (mmWave) channels in the downlink. Because hybrid beamforming offers limited multiplexing gain, improved scheduling is critical for enhancing spectral efficiency and the long-term performance of the system as measured by the proportional fairness (PF) metric. Our objective is to maximize PF by properly designing the analog and digital precoders within the hybrid beamforming and selecting the users subject to the number of radio frequency (RF) chains. Leveraging the characteristics of mmWave channels, we apply a two-timescale protocol. On a long timescale, we assign an analog beam to each user. Scheduling the users and designing the digital precoder are done accordingly on a short timescale. To conduct scheduling, we propose combinatorial solutions, such as greedy and sorting algorithms, followed by a machine learning (ML) approach. Our numerical results highlight the trade-off between the performance and complexity of the proposed approaches. Consequently, we show that the choice of approach depends on the specific criteria within a given scenario.

* To appear in the proceedings of the European Conference on Networks and Communications (EuCNC) & 6G Summit, 2025

Hybrid Classical/RL Local Planner for Ground Robot Navigation

Oct 04, 2024

Vishnu D. Sharma, Jeongran Lee, Matthew Andrews, Ilija Hadžić

Abstract: Local planning is an optimization process within a mobile robot navigation stack that searches for the best velocity vector, given the robot and environment state. Depending on how the optimization criteria and constraints are defined, some planners may be better than others in specific situations. We consider two conceptually different planners. The first planner explores the velocity space in real-time and has superior path-tracking and motion smoothness performance. The second planner was trained using reinforcement learning methods to produce the best velocity based on its training "experience". It is better at avoiding dynamic obstacles but at the expense of motion smoothness. We propose a simple yet effective meta-reasoning approach that takes advantage of both approaches by switching between planners based on the surroundings. We demonstrate the superiority of our hybrid planner, both qualitatively and quantitatively, over the individual planners on a live robot in different scenarios, achieving an improvement of 26% in the navigation time.

SACPlanner: Real-World Collision Avoidance with a Soft Actor Critic Local Planner and Polar State Representations

Mar 21, 2023

Khaled Nakhleh, Minahil Raza, Mack Tang, Matthew Andrews, Rinu Boney, Ilija Hadzic, Jeongran Lee, Atefeh Mohajeri, Karina Palyutina

Abstract: We study the training performance of ROS local planners based on Reinforcement Learning (RL), and the trajectories they produce on real-world robots. We show that recent enhancements to the Soft Actor Critic (SAC) algorithm such as RAD and DrQ achieve almost perfect training after only 10000 episodes. We also observe that on real-world robots the resulting SACPlanner is more reactive to obstacles than traditional ROS local planners such as DWA.

* Accepted at 2023 IEEE International Conference on Robotics and Automation (ICRA)

Learning-Based Adaptive User Selection in Millimeter Wave Hybrid Beamforming Systems

Feb 16, 2023

Junghoon Kim, Matthew Andrews

Abstract: We consider a multi-user hybrid beamforming system, where the multiplexing gain is limited by the small number of RF chains employed at the base station (BS). To allow greater freedom for maximizing the multiplexing gain, it is better if the BS selects and serves some of the users at each scheduling instant, rather than serving all the users all the time. We adopt a two-timescale protocol that takes into account the mmWave characteristics, where at the long timescale an analog beam is chosen for each user, and at the short timescale users are selected for transmission based on the chosen analog beams. The goal of the user selection is to maximize the traditional Proportional Fair (PF) metric. However, this maximization is non-trivial due to interference between the analog beams for selected users. We first define a greedy algorithm and a "top-k" algorithm, and then propose a machine learning (ML)-based user selection algorithm to provide an efficient trade-off between the PF performance and the computation time. Through simulations, we analyze the performance of the ML-based algorithm under various metrics, and show that it gives an efficient trade-off in performance as compared to its counterparts.

* Accepted for publication in IEEE International Conference on Communications (ICC), 2023
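
As a rough illustration of the two combinatorial baselines named in the abstract above, the sketch below selects users under a toy interference model. Everything in it is an assumption made for illustration and is not taken from the paper: the `gain` matrix (where `gain[i][j]` is the power user `i` receives from the beam assigned to user `j`), the SINR-based rate, the PF metric written as the sum of instantaneous rate divided by average rate, and the function names. The digital precoder is ignored entirely.

```python
import math


def pf_metric(selected, gain, avg_rate, noise=1.0):
    """Toy PF metric: sum over selected users of rate / average rate."""
    total = 0.0
    for i in selected:
        # Interference comes from the beams of the other selected users.
        interference = sum(gain[i][j] for j in selected if j != i)
        sinr = gain[i][i] / (noise + interference)
        total += math.log2(1.0 + sinr) / avg_rate[i]
    return total


def greedy_select(gain, avg_rate, num_rf_chains, noise=1.0):
    """Add one user at a time, always the one giving the highest PF metric.

    Stops when the RF-chain budget is reached or no addition improves the metric.
    """
    selected, best = [], 0.0
    candidates = list(range(len(gain)))
    while candidates and len(selected) < num_rf_chains:
        user = max(
            candidates,
            key=lambda u: pf_metric(selected + [u], gain, avg_rate, noise),
        )
        value = pf_metric(selected + [user], gain, avg_rate, noise)
        if value <= best:
            break
        selected.append(user)
        candidates.remove(user)
        best = value
    return selected, best


def top_k_select(gain, avg_rate, k, noise=1.0):
    """Rank users by their interference-free PF value and keep the top k."""
    ranked = sorted(
        range(len(gain)),
        key=lambda u: pf_metric([u], gain, avg_rate, noise),
        reverse=True,
    )
    selected = ranked[:k]
    return selected, pf_metric(selected, gain, avg_rate, noise)


# Hypothetical example: users 0 and 1 have strong but mutually interfering beams.
gain = [
    [10.0, 8.0, 0.1],
    [8.0, 10.0, 0.1],
    [0.1, 0.1, 5.0],
]
avg_rate = [1.0, 1.0, 1.0]

greedy_users, greedy_value = greedy_select(gain, avg_rate, num_rf_chains=2)
topk_users, topk_value = top_k_select(gain, avg_rate, k=2)
print("greedy:", greedy_users, round(greedy_value, 3))
print("top-k: ", topk_users, round(topk_value, 3))
```

In this toy example the sorting-based rule picks the two individually strongest users even though their beams interfere heavily, while the greedy rule accounts for interference at each step at the price of more metric evaluations. This is only meant to hint at the kind of performance versus computation trade-off the abstract refers to.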

Learning Algorithms for Regenerative Stopping Problems with Applications to Shipping Consolidation in Logistics

May 05, 2021

Kishor Jothimurugan, Matthew Andrews, Jeongran Lee, Lorenzo Maggi

Abstract: We study regenerative stopping problems in which the system starts anew whenever the controller decides to stop and the long-term average cost is to be minimized. Traditional model-based solutions involve estimating the underlying process from data and computing strategies for the estimated model. In this paper, we compare such solutions to deep reinforcement learning and imitation learning, which involve learning a neural network policy from simulations. We evaluate the different approaches on a real-world problem of shipping consolidation in logistics and demonstrate that deep learning can be effectively used to solve such problems.

Evolution of Q Values for Deep Q Learning in Stable Baselines

Apr 24, 2020

Matthew Andrews, Cemil Dibek, Karina Palyutina

Abstract: We investigate the evolution of the Q values for the implementation of Deep Q Learning (DQL) in the Stable Baselines library. Stable Baselines incorporates the latest Reinforcement Learning techniques and achieves superhuman performance in many game environments. However, for some simple non-game environments, the DQL in Stable Baselines can struggle to find the correct actions. In this paper we aim to understand the types of environment where this suboptimal behavior can happen, and also investigate the corresponding evolution of the Q values for individual states. We compare a smart TrafficLight environment (where performance is poor) with the AI Gym FrozenLake environment (where performance is perfect). We observe that DQL struggles with TrafficLight because actions are reversible and hence the Q values in a given state are closer than in FrozenLake. We then investigate the evolution of the Q values using a recent decomposition technique of Achiam et al. We observe that for TrafficLight, the function approximation error and the complex relationships between the states lead to a situation where some Q values meander far from optimal.
