With the recent advancement in Deep Reinforcement Learning in the gaming industry, we are curious if the same technology would work as well for common quantitative financial problems. In this paper, we will investigate if an off-the-shelf library developed by OpenAI can be easily adapted to mean reversion strategy. Moreover, we will design and test to see if we can get better performance by narrowing the function space that the agent needs to search for. We achieve this through augmenting the reward function by a carefully picked penalty term.