The performance of algorithmic decision rules is largely dependent on the quality of training datasets available to them. Biases in these datasets can raise economic and ethical concerns due to the resulting algorithms' disparate treatment of different groups. In this paper, we propose algorithms for sequentially debiasing the training dataset through adaptive and bounded exploration in a classification problem with costly and censored feedback. Our proposed algorithms balance between the ultimate goal of mitigating the impacts of data biases -- which will in turn lead to more accurate and fairer decisions, and the exploration risks incurred to achieve this goal. Specifically, we propose adaptive bounds to limit the region of exploration, and leverage intermediate actions which provide noisy label information at a lower cost. We analytically show that such exploration can help debias data in certain distributions, investigate how {algorithmic fairness interventions} can work in conjunction with our proposed algorithms, and validate the performance of these algorithms through numerical experiments on synthetic and real-world data.