Prosumer operators face substantial challenges when participating in short-term electricity markets under uncertainty, including variability in demand, solar generation, wind power, and electricity prices, as well as the short response times required in intraday markets. Machine learning approaches can address these challenges through their ability to continuously learn complex relations and provide real-time responses, and they have become practical with the availability of high-performance computing and big data. To tackle these challenges, the problem is formulated as a Markov decision process with suitable observations and actions and solved with a reinforcement learning algorithm based on tabular Q-learning. The trained agent converges to a policy close to the global optimal solution and increases the prosumer's profit by 13.39% compared with a well-known stochastic optimization approach.
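For reference, the tabular Q-learning agent mentioned above is assumed to follow the standard action-value update (with state $s_t$, action $a_t$, reward $r_t$, learning rate $\alpha$, and discount factor $\gamma$); the exact observation and action definitions used for the prosumer problem are specified later in the paper:

$$Q(s_t, a_t) \leftarrow Q(s_t, a_t) + \alpha \left[ r_t + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]$$

The action-value table is updated after each interaction with the market environment until the policy converges.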