Failure management plays a significant role in optical networks: it ensures secure operation, mitigates potential risks, and enables proactive protection. Machine learning (ML) is a powerful technique for comprehensive data analysis and complex network management, and it is widely used for failure management in optical networks in place of conventional manual methods. In this study, the background of failure management is introduced, and the typical failure management tasks, physical objects, ML algorithms, data sources, and extracted information are described in detail. An overview of ML applications in failure management is then provided, covering alarm analysis, failure prediction, failure detection, failure localization, and failure identification. Finally, future directions for ML in failure management are discussed from the perspectives of data, models, tasks, and emerging techniques.