The conditional average treatment effect (CATE) is the best point prediction of individual causal effects given individual baseline covariates and can help personalize treatments. However, as CATE only reflects the (conditional) average, it can wash out potential risks and tail events, which are crucially relevant to treatment choice. In aggregate analyses, this is usually addressed by measuring distributional treatment effect (DTE), such as differences in quantiles or tail expectations between treatment groups. Hypothetically, one can similarly fit covariate-conditional quantile regressions in each treatment group and take their difference, but this would not be robust to misspecification or provide agnostic best-in-class predictions. We provide a new robust and model-agnostic methodology for learning the conditional DTE (CDTE) for a wide class of problems that includes conditional quantile treatment effects, conditional super-quantile treatment effects, and conditional treatment effects on coherent risk measures given by $f$-divergences. Our method is based on constructing a special pseudo-outcome and regressing it on baseline covariates using any given regression learner. Our method is model-agnostic in the sense that it can provide the best projection of CDTE onto the regression model class. Our method is robust in the sense that even if we learn these nuisances nonparametrically at very slow rates, we can still learn CDTEs at rates that depend on the class complexity and even conduct inferences on linear projections of CDTEs. We investigate the performance of our proposal in simulation studies, and we demonstrate its use in a case study of 401(k) eligibility effects on wealth.