This paper presents a model that uses the information that sellers publish in real estate market websites to predict whether a property has higher or lower price than the average price of its similar properties. The model learns the correlation between price and information (text descriptions and features) of real estate properties through automatic identification of latent semantic content given by a machine learning model based on doc2vec and xgboost. The proposed model was evaluated with a data set of 57,516 publications of real estate properties collected from 2016 to 2018 of Bogot\'a city. Results show that the accuracy of a classifier that involves text descriptions is slightly higher than a classifier that only uses features of the real estate properties, as text descriptions tends to contain detailed information about the property.