Modern machine learning (ML) applications are becoming increasingly complex and monolithic (single chip) accelerator architectures cannot keep up with their energy efficiency and throughput demands. Even though modern digital electronic accelerators are gradually adopting 2.5D architectures with multiple smaller chiplets to improve scalability, they face fundamental limitations due to a reliance on slow metallic interconnects. This paper outlines how optical communication and computation can be leveraged in 2.5D platforms to realize energy-efficient and high throughput 2.5D ML accelerator architectures.