Real-time computer-based accompaniment for human musical performances entails three critical tasks: identifying what the performer is playing, locating their position within the score, and synchronously playing the accompanying parts. Among these, the second task (score following) has been addressed through methods such as dynamic programming on string sequences, Hidden Markov Models (HMMs), and Online Time Warping (OLTW). Yet, the remarkably successful techniques of Deep Learning (DL) have not been directly applied to this problem. Therefore, we introduce HeurMiT, a novel DL-based score-following framework, utilizing a neural architecture designed to learn compressed latent representations that enables precise performer tracking despite deviations from the score. Parallelly, we implement a real-time MIDI data augmentation toolkit, aimed at enhancing the robustness of these learned representations. Additionally, we integrate the overall system with simple heuristic rules to create a comprehensive framework that can interface seamlessly with existing transcription and accompaniment technologies. However, thorough experimentation reveals that despite its impressive computational efficiency, HeurMiT's underlying limitations prevent it from being practical in real-world score following scenarios. Consequently, we present our work as an introductory exploration into the world of DL-based score followers, while highlighting some promising avenues to encourage future research towards robust, state-of-the-art neural score following systems.