The rise of personalized medicine necessitates improved causal inference methods for detecting treatment effect heterogeneity (TEH). Approaches for estimating TEH with observational data have largely focused on continuous outcomes. Methods for estimating TEH with right-censored survival outcomes are relatively limited and have been less thoroughly vetted. Using flexible machine/deep learning (ML/DL) methods within the counterfactual framework is a promising way to handle the complex individual characteristics to which treatments need to be tailored. We contribute a series of simulations representing a variety of confounded heterogeneous survival treatment effect settings with varying degrees of covariate overlap, and compare the operating characteristics of three state-of-the-art survival ML/DL methods for the estimation of TEH. Our results show that nonparametric Bayesian Additive Regression Trees within the framework of the accelerated failure time model (AFT-BART-NP) consistently has the best performance, in terms of both bias and root mean squared error. Additionally, AFT-BART-NP can provide nominal confidence interval coverage when covariate overlap is moderate or strong. Under lack of overlap, where accurate estimation of the average causal effect is generally challenging, AFT-BART-NP still provides valid point and interval estimates for the treatment effect among units near the centroid of the propensity score distribution.