Measuring and analyzing customer flow in retail stores is essential for retailers to understand customer behavior and to support decision-making. Nevertheless, little attention has been given to developing novel technologies for automatic people counting. We introduce LRCN-RetailNet, a recurrent neural network architecture capable of learning a non-linear regression model and accurately predicting the people count from videos captured by low-cost surveillance cameras. The input video follows the recently proposed RGBP image format, which is composed of color and people (foreground) information. Our architecture considers two relevant aspects: spatial features, extracted from the RGBP images by convolutional layers, and the temporal coherence of the problem, exploited by recurrent layers. We show that, through a supervised learning approach, the trained models predict the people count with high accuracy. Additionally, we demonstrate that a straightforward modification of the methodology is effective at excluding salespeople from the people count. Comprehensive experiments were conducted to validate, evaluate, and compare the proposed architecture. The results show that LRCN-RetailNet remarkably outperforms both the previous RetailNet architecture, which was limited to evaluating a single image per iteration, and a state-of-the-art neural network for object detection. Finally, computational performance experiments confirmed that the entire methodology can estimate the people count in real time.
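
As an illustration of the input described above, the following minimal sketch shows one way RGBP frames could be assembled and grouped into short sequences for a recurrent model. It rests on assumptions not stated in the abstract: that the people (foreground) information is stored as a fourth channel appended to each RGB pixel, and that consecutive frames are grouped with a sliding window. The names `to_rgbp` and `sliding_windows` are hypothetical and are not taken from the LRCN-RetailNet implementation.

```python
from typing import List, Sequence

Pixel = List[float]        # assumed layout: [R, G, B, P]
Frame = List[List[Pixel]]  # height x width x 4


def to_rgbp(rgb: Sequence[Sequence[Sequence[float]]],
            people_mask: Sequence[Sequence[float]]) -> Frame:
    """Append a people (foreground) mask as a fourth channel of an RGB frame.

    Assumption: the P channel is a per-pixel value stacked after R, G, B.
    """
    if len(rgb) != len(people_mask):
        raise ValueError("RGB frame and mask must have the same height")
    frame: Frame = []
    for rgb_row, mask_row in zip(rgb, people_mask):
        if len(rgb_row) != len(mask_row):
            raise ValueError("RGB frame and mask must have the same width")
        frame.append([list(px) + [float(m)] for px, m in zip(rgb_row, mask_row)])
    return frame


def sliding_windows(frames: Sequence[Frame], seq_len: int) -> List[List[Frame]]:
    """Group consecutive frames into overlapping sequences of length seq_len.

    Each sequence would be one input sample: convolutional layers process
    every frame, and recurrent layers consume the resulting feature sequence.
    """
    if seq_len <= 0:
        raise ValueError("seq_len must be positive")
    return [list(frames[i:i + seq_len])
            for i in range(len(frames) - seq_len + 1)]
```

Under these assumptions, each sample has shape (sequence length, height, width, 4), and a regression target such as the people count would be attached to each sequence for supervised training.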