Poor air quality can have a significant impact on human health. The National Oceanic and Atmospheric Administration (NOAA) air quality forecasting guidance is challenged by the increasing presence of extreme air quality events due to extreme weather events such as wild fires and heatwaves. These extreme air quality events further affect human health. Traditional methods used to correct model bias make assumptions about linearity and the underlying distribution. Extreme air quality events tend to occur without a strong signal leading up to the event and this behavior tends to cause existing methods to either under or over compensate for the bias. Deep learning holds promise for air quality forecasting in the presence of extreme air quality events due to its ability to generalize and learn nonlinear problems. However, in the presence of these anomalous air quality events, standard deep network approaches that use a single network for generalizing to future forecasts, may not always provide the best performance even with a full feature-set including geography and meteorology. In this work we describe a method that combines unsupervised learning and a forecast-aware bi-directional LSTM network to perform bias correction for operational air quality forecasting using AirNow station data for ozone and PM2.5 in the continental US. Using an unsupervised clustering method trained on station geographical features such as latitude and longitude, urbanization, and elevation, the learned clusters direct training by partitioning the training data for the LSTM networks. LSTMs are forecast-aware and implemented using a unique way to perform learning forward and backwards in time across forecasting days. When comparing the RMSE of the forecast model to the RMSE of the bias corrected model, the bias corrected model shows significant improvement (27\% lower RMSE for ozone) over the base forecast.