Abstract: The Kalman filter (KF) is a widely used algorithm for tracking dynamic systems that are captured by state space (SS) models. The need to fully describe an SS model limits its applicability in complex settings, e.g., when tracking based on visual data, and the processing of high-dimensional signals often induces notable latency. These challenges can be treated by mapping the measurements into latent features obeying some postulated closed-form SS model, and applying the KF in the latent space. However, the validity of this approximated SS model may constitute a limiting factor. In this work, we study tracking from high-dimensional measurements under complex settings using a hybrid model-based/data-driven approach. By gradually tackling the challenges in handling the observation model and the task, we develop Latent-KalmanNet, which implements tracking from high-dimensional measurements by leveraging data to jointly learn the KF along with the latent space mapping. Latent-KalmanNet combines a learned encoder with data-driven tracking in the latent space using the recently proposed KalmanNet, while identifying the ability of each of these trainable modules to assist its counterpart: KalmanNet provides a suitable prior, and the encoder learns a latent representation that facilitates data-aided tracking. Our empirical results demonstrate that the proposed Latent-KalmanNet achieves improved accuracy and run-time performance over both model-based and data-driven techniques by learning a surrogate latent representation that most facilitates tracking, while operating with limited complexity and latency.
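
To make the baseline idea concrete (map each measurement into a latent feature, then apply the KF in the latent space), the following is a minimal sketch and not the Latent-KalmanNet method itself. It assumes a scalar latent state with a postulated random-walk SS model, and it replaces the learned encoder with a hypothetical fixed function `encode` that averages the measurement entries. The names `encode`, `latent_kf_step`, and `track`, as well as the noise parameters `q` and `r`, are illustrative assumptions rather than quantities taken from this work.

```python
from typing import List, Sequence, Tuple


def encode(y: Sequence[float]) -> float:
    """Hypothetical stand-in for a learned encoder: maps a high-dimensional
    measurement to a scalar latent feature (here, simply its mean)."""
    return sum(y) / len(y)


def latent_kf_step(
    x_prev: float,
    p_prev: float,
    z: float,
    f: float = 1.0,   # assumed latent state-evolution coefficient
    q: float = 0.01,  # assumed process-noise variance
    r: float = 0.1,   # assumed latent observation-noise variance
) -> Tuple[float, float]:
    """One predict/update step of a scalar Kalman filter in the latent space.

    Postulated latent model: x_t = f * x_{t-1} + w_t,  z_t = x_t + v_t.
    """
    # Predict
    x_pred = f * x_prev
    p_pred = f * p_prev * f + q
    # Update
    gain = p_pred / (p_pred + r)
    x_new = x_pred + gain * (z - x_pred)
    p_new = (1.0 - gain) * p_pred
    return x_new, p_new


def track(
    measurements: Sequence[Sequence[float]],
    x0: float = 0.0,
    p0: float = 1.0,
) -> List[float]:
    """Encode each measurement, then filter the resulting latent sequence."""
    x, p = x0, p0
    estimates = []
    for y in measurements:
        z = encode(y)
        x, p = latent_kf_step(x, p, z)
        estimates.append(x)
    return estimates


if __name__ == "__main__":
    # Toy data: 50 eight-dimensional measurements of a constant latent value.
    data = [[1.0] * 8 for _ in range(50)]
    print(track(data)[-1])
```

In this sketch the Kalman gain is computed from the postulated noise variances; a data-driven tracker such as KalmanNet would instead learn the gain from data, and the encoder would be trained rather than fixed.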