We consider the problem of learning optimal policies from observational data in a way that satisfies certain fairness criteria. The issue of fairness arises where some covariates used in decision making are sensitive features, or are correlated with sensitive features. (Nabi and Shpitser 2018) formalized fairness in the context of regression problems as constraining the causal effects of sensitive features along certain disallowed causal pathways. The existence of these causal effects may be called retrospective unfairness in the sense of already being present in the data before analysis begins, and may be due to discriminatory practices or the biased way in which variables are defined or recorded. In the context of learning policies, what we call prospective bias, i.e., the inappropriate dependence of learned policies on sensitive features, is also possible. In this paper, we use methods from causal and semiparametric inference to learn optimal policies in a way that addresses both retrospective bias in the data, and prospective bias due to the policy. In addition, our methods appropriately address statistical bias due to model misspecification and confounding bias, which are important in the estimation of path-specific causal effects from observational data. We apply our methods to both synthetic data and real criminal justice data.