Chaos and unpredictability are traditionally synonymous, yet recent advances in statistical forecasting suggest that large machine learning models can derive unexpected insight from extended observation of complex systems. Here, we study the forecasting of chaos at scale by performing a large-scale comparison of 24 representative state-of-the-art multivariate forecasting methods on a crowdsourced database of 135 distinct low-dimensional chaotic systems. We find that large, domain-agnostic time series forecasting methods based on artificial neural networks consistently exhibit strong forecasting performance, in some cases producing accurate predictions lasting for dozens of Lyapunov times. Best-in-class results for forecasting chaos are achieved by recently introduced hierarchical neural basis function models, though even generic transformers and recurrent neural networks perform strongly. However, physics-inspired hybrid methods such as neural ordinary differential equations and reservoir computers carry inductive biases that confer greater data efficiency and shorter training times in data-limited settings. We observe consistent correlations among the forecasts of all methods despite their widely varying architectures, as well as universal structure in how predictions decay over long time intervals. Our results suggest that a key advantage of modern forecasting methods stems not from their architectural details, but rather from their capacity to learn the large-scale structure of chaotic attractors.
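For context on the forecast-horizon unit, the Lyapunov time follows its standard definition (stated here for reference; the symbols $\lambda_{\max}$, $T_\lambda$, and $\delta\mathbf{x}$ are generic notation, not taken from this abstract): an infinitesimal perturbation to a chaotic system's initial condition grows, on average, exponentially at the rate set by the largest Lyapunov exponent,

\[
\|\delta\mathbf{x}(t)\| \approx \|\delta\mathbf{x}(0)\|\, e^{\lambda_{\max} t},
\qquad
T_\lambda \equiv \frac{1}{\lambda_{\max}},
\]

so a forecast that remains accurate for dozens of Lyapunov times stays valid long after initial-condition uncertainty has grown by many orders of magnitude.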