Rapid advancements in the E-commerce sector over the last few decades have led to an imminent need for personalised, efficient and dynamic recommendation systems. To sufficiently cater to this need, we propose a novel method for generating top-k recommendations by creating an ensemble of clustering with reinforcement learning. We have incorporated DB Scan clustering to tackle vast item space, hence in-creasing the efficiency multi-fold. Moreover, by using deep contextual reinforcement learning, our proposed work leverages the user features to its full potential. With partial updates and batch updates, the model learns user patterns continuously. The Duelling Bandit based exploration provides robust exploration as compared to the state-of-art strategies due to its adaptive nature. Detailed experiments conducted on a public dataset verify our claims about the efficiency of our technique as com-pared to existing techniques.