Accurate expected time of arrival (ETA) information is crucial for maintaining the quality of service of public transit. Recent advances in artificial intelligence (AI) have led to more effective models for ETA estimation, but these models rely heavily on large GPS datasets. Moreover, these datasets are mainly cab-based and may not be suitable for bus-based public transport. Consequently, the latest methods may not be applicable to ETA estimation in cities that lack large training datasets. In many cities, the ETA estimation problem must instead be solved with small datasets that contain outliers and anomalies and may be incomplete. This work presents a simple but robust model for ETA estimation on a bus route that relies only on the historical data of that particular route. We propose a system that generates ETA information for a trip and updates it as the trip progresses, based on real-time information. We train a deep-learning-based generative model that learns the probability distribution of ETA data across trips and, conditioned on the current trip information, updates the ETA information on the go. Our plug-and-play model not only captures the non-linearity of the task well but can also be used by any transit agency without any external data source. Experiments on data collected from three routes in the city of Delhi illustrate the promise of our approach.