Abstract:A significant number of images shared on social media platforms such as Facebook and Instagram contain text in various forms. It's increasingly becoming commonplace for bad actors to share misinformation, hate speech or other kinds of harmful content as text overlaid on images on such platforms. A scene-text understanding system should hence be able to handle text in various orientations that the adversary might use. Moreover, such a system can be incorporated into screen readers used to aid the visually impaired. In this work, we extend the scene-text extraction system at Facebook, Rosetta, to efficiently handle text in various orientations. Specifically, we incorporate the Rotation Region Proposal Networks (RRPN) in our text extraction pipeline and offer practical suggestions for building and deploying a model for detecting and recognizing text in arbitrary orientations efficiently. Experimental results show a significant improvement on detecting rotated text.