In this paper, we investigate the channel estimation problem for extremely large-scale multiple-input multiple-output (XL-MIMO) systems, considering the spherical wavefront effect, the spatially non-stationary (SnS) property, and dual-wideband effects. To accurately characterize the XL-MIMO channel, we first derive a novel spatial-and-frequency-domain channel model for XL-MIMO systems and carefully examine the channel characteristics in the angular-and-delay domain. Based on the obtained channel representation, we formulate XL-MIMO channel estimation as a Bayesian inference problem. To fully exploit the clustered sparsity of angular-and-delay channels and capture the inter-antenna and inter-subcarrier correlations, we adopt a Markov random field (MRF)-based hierarchical prior model. To enable efficient channel reconstruction, we propose a sparse Bayesian learning (SBL) algorithm based on approximate message passing (AMP) with a unitary transformation. To match the MRF-based hierarchical prior model, the message passing equations are reformulated using structured variational inference, belief propagation, and mean-field rules. Finally, simulation results validate the convergence of the proposed algorithm and its superiority over existing methods.