With the rise of big data technologies, many smart transportation applications have been developed rapidly in recent years, including bus arrival time prediction. Such applications help passengers plan trips more efficiently and avoid unpredictable waiting times at bus stops. Many studies focus on improving the prediction accuracy of various machine learning and statistical models, while far less work demonstrates their applicability when deployed and used in realistic urban settings. This paper aims to fill this gap by proposing a general and practical evaluation framework for analysing widely used prediction models for bus arrival time, namely delay, k-nearest-neighbour, kernel regression, additive model, and recurrent neural network using long short-term memory. In particular, the framework contains a pre-processing method for raw bus GPS data that requires far fewer input data points while still maintaining satisfactory prediction results. Using a KD-tree based nearest point search, this pre-processing method enables the models to predict arrival times at bus stops only. Based on this framework, and using raw bus GPS datasets of different scales from the city of Dublin, Ireland, we also present preliminary results for city managers by analysing the practical strengths and weaknesses of commonly used prediction models in both the training and prediction stages.
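To illustrate the KD-tree based nearest point search mentioned above, the following is a minimal sketch rather than the paper's implementation. It assumes the tree is built over the raw GPS samples of one trip and queried once per bus stop, so that only one sample per stop is kept. The function names (`to_xy`, `build_kdtree`, `nearest`, `snap_stops_to_gps`), the equirectangular projection, and the coordinates are all hypothetical choices made for this example.

```python
import math

def to_xy(lat, lon, lat0=53.35):
    """Project degrees to approximate local metres (equirectangular).

    Assumption: this is accurate enough at city scale; lat0 is a
    reference latitude roughly matching Dublin.
    """
    r = 6371000.0  # mean Earth radius in metres
    x = math.radians(lon) * r * math.cos(math.radians(lat0))
    y = math.radians(lat) * r
    return (x, y)

def build_kdtree(points, depth=0):
    """Build a 2-D KD-tree from a list of (xy, payload) pairs.

    Each node is a tuple (point, left_subtree, right_subtree).
    """
    if not points:
        return None
    axis = depth % 2  # alternate between x and y
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2
    return (points[mid],
            build_kdtree(points[:mid], depth + 1),
            build_kdtree(points[mid + 1:], depth + 1))

def nearest(node, target, depth=0, best=None):
    """Return (squared_distance, point) for the stored point closest to target."""
    if node is None:
        return best
    point, left, right = node
    d2 = (point[0][0] - target[0]) ** 2 + (point[0][1] - target[1]) ** 2
    if best is None or d2 < best[0]:
        best = (d2, point)
    axis = depth % 2
    diff = target[axis] - point[0][axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, target, depth + 1, best)
    # Search the far side only if the splitting plane is closer than the best hit.
    if diff * diff < best[0]:
        best = nearest(far, target, depth + 1, best)
    return best

def snap_stops_to_gps(stops, gps_points):
    """For each stop, keep only the GPS sample nearest to it.

    stops: iterable of (stop_id, lat, lon)
    gps_points: non-empty iterable of (timestamp, lat, lon)
    """
    tree = build_kdtree([(to_xy(lat, lon), (t, lat, lon))
                         for t, lat, lon in gps_points])
    result = {}
    for stop_id, lat, lon in stops:
        d2, (_, sample) = nearest(tree, to_xy(lat, lon))
        result[stop_id] = {"timestamp": sample[0],
                           "distance_m": math.sqrt(d2)}
    return result

# Hypothetical trip: one GPS sample every 20 s, heading north.
gps_points = [(0, 53.3400, -6.2600), (20, 53.3410, -6.2600),
              (40, 53.3420, -6.2600), (60, 53.3430, -6.2600),
              (80, 53.3440, -6.2600)]
# Hypothetical stops close to the second and last samples.
stops = [("A", 53.3411, -6.2601), ("B", 53.3439, -6.2599)]

snapped = snap_stops_to_gps(stops, gps_points)
for stop_id, info in snapped.items():
    print(stop_id, info["timestamp"], round(info["distance_m"], 1))
```

Under these assumptions, the five raw samples reduce to two stop-level records, which is the kind of input reduction the pre-processing step is described as providing. A production version would likely also reject matches whose distance exceeds a threshold.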