There has been huge recent interest in the potential of making operational weather forecasts using machine learning techniques. As they become a part of the weather forecasting toolbox, there is a pressing need to understand how well current machine learning models can simulate high-impactweather events. We compare forecasts of Storm Ciar\'an, a European windstorm that caused sixteen deaths and extensive damage in Northern Europe, made by machine learning and numericalweather prediction models. The four machine learning models considered (FourCastNet, Pangu-Weather, GraphCast and FourCastNet-v2) produce forecasts that accurately capture the synoptic-scale structure of the cyclone including the position of the cloud head, shape of the warm sector and location of warm conveyor belt jet, and the large-scale dynamical drivers important for the rapid storm development such as the position of the storm relative to the upper-level jet exit. However, their ability to resolve the more detailed structures important for issuing weather warnings is more mixed. All of the machine learning models underestimate the peak amplitude of winds associated with the storm, only some machine learning models resolve the warm core seclusion and none of the machine learning models capture the sharp bent-back warm frontal gradient. Our study shows there is a great deal about the performance and properties of machine learning weather forecasts that can be derived from case studies of high-impact weather events such as Storm Ciar\'an.