Advancing the capabilities of earthquake nowcasting, the real-time forecasting of seismic activities remains a crucial and enduring objective aimed at reducing casualties. This multifaceted challenge has recently gained attention within the deep learning domain, facilitated by the availability of extensive, long-term earthquake datasets. Despite significant advancements, existing literature on earthquake nowcasting lacks comprehensive evaluations of pre-trained foundation models and modern deep learning architectures. These architectures, such as transformers or graph neural networks, uniquely focus on different aspects of data, including spatial relationships, temporal patterns, and multi-scale dependencies. This paper addresses the mentioned gap by analyzing different architectures and introducing two innovation approaches called MultiFoundationQuake and GNNCoder. We formulate earthquake nowcasting as a time series forecasting problem for the next 14 days within 0.1-degree spatial bins in Southern California, spanning from 1986 to 2024. Earthquake time series is forecasted as a function of logarithm energy released by quakes. Our comprehensive evaluation employs several key performance metrics, notably Nash-Sutcliffe Efficiency and Mean Squared Error, over time in each spatial region. The results demonstrate that our introduced models outperform other custom architectures by effectively capturing temporal-spatial relationships inherent in seismic data. The performance of existing foundation models varies significantly based on the pre-training datasets, emphasizing the need for careful dataset selection. However, we introduce a new general approach termed MultiFoundationPattern that combines a bespoke pattern with foundation model results handled as auxiliary streams. In the earthquake case, the resultant MultiFoundationQuake model achieves the best overall performance.