Robust autonomous driving requires agents to accurately identify unexpected areas in urban scenes. To this end, some critical issues remain open: how to design advisable metric to measure anomalies, and how to properly generate training samples of anomaly data? Previous effort usually resorts to uncertainty estimation and sample synthesis from classification tasks, which ignore the context information and sometimes requires auxiliary datasets with fine-grained annotations. On the contrary, in this paper, we exploit the strong context-dependent nature of segmentation task and design an energy-guided self-supervised frameworks for anomaly segmentation, which optimizes an anomaly head by maximizing the likelihood of self-generated anomaly pixels. To this end, we design two estimators for anomaly likelihood estimation, one is a simple task-agnostic binary estimator and the other depicts anomaly likelihood as residual of task-oriented energy model. Based on proposed estimators, we further incorporate our framework with likelihood-guided mask refinement process to extract informative anomaly pixels for model training. We conduct extensive experiments on challenging Fishyscapes and Road Anomaly benchmarks, demonstrating that without any auxiliary data or synthetic models, our method can still achieves competitive performance to other SOTA schemes.