Recent years have seen a surge of interest in the algorithmic estimation of stochastic entropy production (EP) from trajectory data via machine learning. A crucial element of such algorithms is the identification of a loss function whose minimization guarantees accurate EP estimation. In this study, we show that there exists a host of loss functions, namely those implementing a variational representation of the $\alpha$-divergence, which can be used for EP estimation. Among these loss functions, the one corresponding to $\alpha = -0.5$ exhibits the most robust performance against strong nonequilibrium driving or slow dynamics, either of which adversely affects the existing method based on the Kullback-Leibler divergence ($\alpha = 0$). To corroborate our findings, we present an exactly solvable simplification of the EP estimation problem, whose loss function landscape and stochastic properties demonstrate the optimality of $\alpha = -0.5$.