We consider a status update system consisting of a sensor and a receiver. The sensor samples status updates and forwards them to the receiver through a channel whose delay distribution is non-stationary and unknown. Data freshness at the receiver is quantified by the Age of Information (AoI). The goal is to design an online sampling strategy that minimizes the average AoI without prior knowledge of the delay distribution. Because the delay distribution may change over time, we propose an algorithm that combines stochastic approximation with non-parametric change point detection to (1) learn the optimal update threshold while the delay distribution remains static, and (2) quickly detect a change in the transmission delay distribution and then restart the learning process. Simulation results show that the proposed algorithm detects delay changes quickly and that the average AoI achieved by the proposed policy converges to the minimum AoI.
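
The two components described above can be illustrated with a minimal sketch. It is not the algorithm evaluated in this work, only one possible instance under stated assumptions: the sampler waits until the age at the receiver reaches a threshold gamma, gamma is updated by a Robbins-Monro step with step size 1/k toward the root of E[L^2/2 - gamma*L] = 0 (L being the interval between consecutive samples), and a sliding-window two-sample Kolmogorov-Smirnov statistic stands in for the non-parametric change point detector. All function names and the parameters `gamma_max`, `window`, and `alarm` are hypothetical.

```python
def wait_time(gamma, delay):
    """Assumed threshold policy: wait until the age at the receiver reaches gamma."""
    return max(gamma - delay, 0.0)


def update_threshold(gamma, delay, k, gamma_max=100.0):
    """One Robbins-Monro step toward the root of E[L^2/2 - gamma*L] = 0,
    where L = delay + wait is the inter-sample interval (assumed model)."""
    frame = delay + wait_time(gamma, delay)
    step = 1.0 / k  # assumed diminishing step size
    gamma = gamma + step * (0.5 * frame ** 2 - gamma * frame)
    return min(max(gamma, 0.0), gamma_max)  # project onto [0, gamma_max]


def ks_statistic(xs, ys):
    """Two-sample Kolmogorov-Smirnov distance between empirical CDFs."""
    points = sorted(set(xs) | set(ys))

    def ecdf(sample, t):
        return sum(1 for v in sample if v <= t) / len(sample)

    return max(abs(ecdf(xs, t) - ecdf(ys, t)) for t in points)


def run(delays, window=50, alarm=0.5):
    """Learn gamma online and restart learning when the recent delay window
    differs from the reference window. Returns (gamma, restart_indices)."""
    gamma, k = 0.0, 1
    reference, recent, restarts = [], [], []
    for i, d in enumerate(delays):
        gamma = update_threshold(gamma, d, k)
        k += 1
        if len(reference) < window:
            reference.append(d)  # still collecting the reference window
            continue
        recent.append(d)
        if len(recent) > window:
            recent.pop(0)  # keep only the most recent `window` delays
        if len(recent) == window and ks_statistic(reference, recent) > alarm:
            restarts.append(i)
            gamma, k = 0.0, 1  # restart the learning process
            reference, recent = recent, []
    return gamma, restarts
```

For a constant delay d, this update settles at gamma = d/2, which corresponds to zero waiting. Feeding `run` a delay sequence that switches from one constant to another triggers a single restart, after which gamma is relearned for the new delay.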