Supervised learning has been applied to wireless communications to solve complex problems, owing to its wide applicability. However, labels for supervision can be expensive to generate or even unavailable in wireless communications, and constraints cannot be explicitly guaranteed in a supervised manner. In this work, we introduce an unsupervised learning framework that exploits mathematical models and optimization knowledge to search for an unknown policy without supervision. The framework is applicable to both variable and functional optimization problems with instantaneous and long-term constraints. As examples, we consider two resource allocation problems in ultra-reliable and low-latency communications, which involve one and two timescales, respectively. Unsupervised learning is used to find approximately optimal solutions to these problems. Simulation results show that the learned solution achieves the same bandwidth efficiency as the optimal solution in symmetric scenarios. Comparing the learned solution with existing policies illustrates the benefit of exploiting frequency diversity and multi-user diversity to improve bandwidth efficiency in both symmetric and asymmetric scenarios. We further show that, with pre-training, the unsupervised learning algorithm converges rapidly.