In this paper, we present a novel content caching and delivery approach for mobile virtual reality (VR) video streaming. The proposed approach aims to maximize VR video streaming performance, i.e., minimizing video frame missing rate, by proactively caching popular VR video chunks and adaptively scheduling computing resources at an edge server based on user and network dynamics. First, we design a scalable content placement scheme for deciding which video chunks to cache at the edge server based on tradeoffs between computing and caching resource consumption. Second, we propose a machine learning-assisted VR video delivery scheme, which allocates computing resources at the edge server to satisfy video delivery requests from multiple VR headsets. A Whittle index-based method is adopted to reduce the video frame missing rate by identifying network and user dynamics with low signaling overhead. Simulation results demonstrate that the proposed approach can significantly improve VR video streaming performance over conventional caching and computing resource scheduling strategies.