Accurate prediction of public transit ridership is vital for efficient planning and management of transit in rapidly growing urban areas in Canada. Unexpected increases in passengers can cause overcrowded vehicles, longer boarding times, and service disruptions. Traditional time series models like ARIMA and SARIMA face limitations, particularly in short-term predictions and integration of spatial and temporal features. These models struggle with the dynamic nature of ridership patterns and often ignore spatial correlations between nearby stops. Deep Learning (DL) models present a promising alternative, demonstrating superior performance in short-term prediction tasks by effectively capturing both spatial and temporal features. However, challenges such as dynamic spatial feature extraction, balancing accuracy with computational efficiency, and ensuring scalability remain. This paper introduces DST-TransitNet, a hybrid DL model for system-wide station-level ridership prediction. This proposed model uses graph neural networks (GNN) and recurrent neural networks (RNN) to dynamically integrate the changing temporal and spatial correlations within the stations. The model also employs a precise time series decomposition framework to enhance accuracy and interpretability. Tested on Bogota's BRT system data, with three distinct social scenarios, DST-TransitNet outperformed state-of-the-art models in precision, efficiency and robustness. Meanwhile, it maintains stability over long prediction intervals, demonstrating practical applicability.