Effective and timely responses to unexpected contingencies are crucial for enhancing the resilience of power grids. Because cascading failures propagate through the grid quickly and in complex ways, corrective actions such as optimal load shedding (OLS) are difficult to compute and deploy in large-scale networks, owing to computational complexity and communication latency. This work puts forth a learning-for-OLS approach that constructs the optimal load shedding decision rules for a variety of potential contingency scenarios through offline neural network (NN) training. Notably, the proposed NN-based OLS decisions are fully decentralized, enabling individual load centers to react quickly to a specific contingency using readily available local measurements. Numerical studies on the IEEE 14-bus system demonstrate the effectiveness of the proposed scalable OLS design for real-time responses to severe grid emergency events.
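To make the decentralized decision rule concrete, the following minimal Python sketch shows only the online inference step: each load center evaluates its own small NN on its own measurements and obtains a load shedding fraction, with no communication between load centers. Everything specific here is an assumption made for illustration and is not taken from this work: the function names `make_local_policy` and `shed_fraction`, the one-hidden-layer architecture, the three input features, the bus numbers, and the measurement values are all hypothetical. The random weights are placeholders for parameters that offline training would produce.

```python
import math
import random


def make_local_policy(n_inputs, n_hidden, seed):
    """Build a tiny one-hidden-layer network for one load center.

    ASSUMPTION: the weights are random placeholders. In the approach
    described above they would come from offline training over many
    contingency scenarios, which this sketch does not implement.
    """
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = rng.uniform(-1, 1)
    return w1, b1, w2, b2


def shed_fraction(policy, local_measurements):
    """Map one load center's local measurements to a shedding fraction in (0, 1)."""
    w1, b1, w2, b2 = policy
    # Hidden layer with ReLU activation.
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, local_measurements)) + b)
        for row, b in zip(w1, b1)
    ]
    # A sigmoid output keeps the decision between "shed nothing" and "shed all".
    z = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical load buses, each with its own independently stored policy.
policies = {bus: make_local_policy(n_inputs=3, n_hidden=4, seed=bus) for bus in (4, 9, 14)}

# Hypothetical local features per bus, e.g. voltage magnitude (p.u.) and
# two other locally observable quantities; the values are made up.
local_measurements = {
    4: [0.95, 0.40, -0.10],
    9: [0.91, 0.75, -0.25],
    14: [0.88, 0.60, -0.30],
}

# Each decision uses only that bus's own policy and measurements.
decisions = {
    bus: shed_fraction(policies[bus], local_measurements[bus])
    for bus in policies
}

for bus, fraction in decisions.items():
    print(f"bus {bus}: shed {fraction:.1%} of local load")
```

Because each decision depends only on one bus's own inputs, the per-bus evaluation is a handful of multiply-adds and could run locally at each load center, which is the property that avoids the communication latency of a centralized solve.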