Abstract: With the decreasing cost of data collection, the space of variables or features that can be used to characterize a particular predictor of interest continues to grow exponentially. Identifying the most characterizing features, those that minimize the variance of our models without jeopardizing their bias, is therefore critical to successfully training a machine learning model. Identifying such features is also critical for interpretability, prediction accuracy, and computation cost. While statistical methods such as subset selection, shrinkage, and dimensionality reduction have been applied to select the best set of features, other approaches in the literature have treated feature selection as a search problem in which each state in the search space is a possible feature subset. In this paper, we solved the feature selection problem using Reinforcement Learning. Formulating the state space as a Markov Decision Process (MDP), we used a Temporal Difference (TD) algorithm to select the best subset of features. Each state was evaluated using a robust, low-cost classifier that could handle any non-linearities in the dataset.
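As a rough illustration of the formulation summarized above, the following minimal sketch treats each feature subset as a state, adding one feature as an action, and applies a tabular TD(0) value update. It is not the paper's implementation. The reward definition (the change in evaluation score), the epsilon-greedy policy, the hyperparameter values, and the names `td_feature_selection` and `toy_evaluate` are all assumptions made for illustration. In particular, `toy_evaluate` is a hypothetical stand-in for the classifier-based evaluation of a state.

```python
import random


def td_feature_selection(n_features, evaluate, episodes=200,
                         alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Search feature subsets with tabular TD(0).

    States are frozensets of feature indices; an action adds one feature.
    The reward (an assumption here) is the change in evaluation score.
    """
    rng = random.Random(seed)
    values = {}   # V(s): estimated value of each visited subset
    scores = {}   # cache so each subset is evaluated only once

    def score(state):
        if state not in scores:
            scores[state] = evaluate(state)
        return scores[state]

    best = frozenset()
    for _ in range(episodes):
        state = frozenset()  # every episode starts from the empty subset
        while len(state) < n_features:
            candidates = [state | {f} for f in range(n_features)
                          if f not in state]
            if rng.random() < epsilon:
                # Explore: pick a random feature to add.
                nxt = rng.choice(candidates)
            else:
                # Exploit: one-step lookahead on reward plus discounted value.
                nxt = max(candidates,
                          key=lambda s: score(s) - score(state)
                          + gamma * values.get(s, 0.0))
            reward = score(nxt) - score(state)
            # TD(0) update: V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))
            v = values.get(state, 0.0)
            td_target = reward + gamma * values.get(nxt, 0.0)
            values[state] = v + alpha * (td_target - v)
            state = nxt
            # Track the best-scoring subset seen so far.
            if score(state) > score(best):
                best = state
    return best, score(best)


# Hypothetical evaluator: features 0 and 2 help, every other feature hurts.
# In practice this would be a cross-validated classifier score.
INFORMATIVE = frozenset({0, 2})


def toy_evaluate(subset):
    useful = len(subset & INFORMATIVE)
    noisy = len(subset - INFORMATIVE)
    return 0.5 + 0.2 * useful - 0.05 * noisy


best_subset, best_score = td_feature_selection(5, toy_evaluate)
print(sorted(best_subset), round(best_score, 3))
```

Caching the score of each subset keeps the number of evaluator calls bounded by the number of distinct states visited, which matters when the evaluator is a trained classifier rather than a toy function.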