Abstract: In the present work, a reinforcement learning (RL) based adaptive algorithm that optimises the transmit beampattern of a colocated massive MIMO radar is presented. In the massive MIMO regime, a robust Wald-type detector, able to guarantee certain detection performance under a wide range of practical disturbance models, has recently been proposed. Furthermore, an RL/cognitive methodology has been exploited to improve the detection performance by learning from and interacting with the unknown surrounding environment. Building upon these previous findings, we develop here a fully adaptive, data-driven scheme for selecting the hyper-parameters involved in the RL algorithm. Such an adaptive selection makes the Wald RL-based detector independent of any ad hoc, and potentially suboptimal, manual tuning of the hyper-parameters. Simulation results show the effectiveness of the proposed scheme in harsh scenarios with strong clutter and low SNR values.