Understanding the combined influence of meteorological and hydrological factors on water level and flood events is essential, particularly under today's changing climate. The Transformer, a cutting-edge deep learning architecture, offers an effective way to model intricate nonlinear processes, enabling both the extraction of key features and the prediction of water level. Explainable Artificial Intelligence (XAI) methods play an important role in improving the understanding of how different factors affect water level. In this study, we propose a Transformer variant that integrates a sparse attention mechanism and introduces a nonlinear output layer in the decoder module. The variant model is used for multi-step forecasting of water level, taking meteorological and hydrological factors into account simultaneously. The variant model outperforms the traditional Transformer across different lead times with respect to various evaluation metrics. Sensitivity analyses based on XAI demonstrate that meteorological factors have a significant influence on water level evolution, with temperature being the most dominant meteorological factor. Incorporating both meteorological and hydrological factors is therefore necessary for reliable hydrological prediction and flood prevention. In addition, XAI provides insight into specific predictions, which helps in understanding the prediction results and evaluating their reasonableness.