Recent technological advancements in retinal surgery have led to a modern operating room equipped with a surgical robot, a microscope, and intraoperative optical coherence tomography (iOCT). The integration of these tools raises the fundamental question of how to combine them effectively to enable surgical autonomy. In this work, we address this question by developing a unified framework that enables real-time autonomous surgical workflows using these devices. To achieve this, we make the following contributions: (1) we develop a novel imaging system that integrates microscopy and iOCT in real time by dynamically tracking the surgical instrument with a small iOCT scanning region (e.g., a B-scan), which was not previously possible; (2) we implement convolutional neural networks (CNNs) that automatically segment and detect task-relevant information for surgical autonomy; (3) we enable surgeons to intuitively select goal waypoints in both the microscope and iOCT views through simple mouse clicks; and (4) we integrate model predictive control (MPC) for real-time trajectory generation that respects kinematic constraints to ensure patient safety. We demonstrate the utility of our system on subretinal injection (SI), a procedure that involves inserting a microneedle below the retinal tissue for targeted drug delivery and that surgeons find challenging because it requires accuracy on the order of tens of micrometers and precise depth perception. We validate our system by conducting 30 successful SI trials on pig eyes, achieving a needle insertion accuracy of $26 \pm 12\,\mu\mathrm{m}$ to various subretinal goals and a duration of $55 \pm 10.8$ seconds. Preliminary comparisons to a human operator performing SI in robot-assisted mode highlight the enhanced safety of our system.