This paper proposes an unmanned aerial vehicle (UAV)-aided content management system for communication-challenged disaster scenarios. In the absence of cellular infrastructure, communities of stranded users can be given access to situation-critical content through a hybrid network of static and traveling UAVs. A set of relatively static anchor UAVs download content from central servers and provide content access to their local users. A set of ferrying UAVs with wider mobility provision content to users by shuffling it across different anchor UAVs while visiting different user communities. The objective is to design a content dissemination system that learns content caching policies on the fly in order to maximize content availability to the stranded users. To this end, the paper proposes a decentralized Top-k Multi-Armed Bandit learning model for UAV caching decisions that accounts for geo-temporal differences in content popularity and heterogeneity in content demand. The proposed approach combines the expected-reward maximization property of Top-k Multi-Armed Bandits with a proposed multi-dimensional reward structure to make caching decisions at the UAVs. The study covers different user-specified tolerable access delays, heterogeneous popularity distributions, and inter-community geographical characteristics. Functional verification and performance evaluation of the proposed caching framework are carried out for a wide range of network sizes, UAV distributions, and content popularity.
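To make the Top-k bandit idea concrete, the following minimal sketch treats each content item as an arm and lets a single UAV cache k items per round. It is an illustration under stated assumptions, not the model proposed in this paper: the epsilon-greedy exploration rule, the Zipf-like popularity, the scalar hit/miss reward, and the names `zipf_popularity` and `top_k_bandit_cache` are all hypothetical, and the multi-dimensional reward structure and the anchor/ferrying UAV interaction are not modeled.

```python
import random


def zipf_popularity(n_contents, alpha=1.0):
    """Zipf-like request probabilities (an assumption, not taken from the paper)."""
    weights = [1.0 / (rank ** alpha) for rank in range(1, n_contents + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def top_k_bandit_cache(popularity, k, rounds=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy Top-k bandit: each arm is a content item, k arms are cached per round.

    The learner never reads `popularity` directly; it only observes whether a
    cached item was the one requested (reward 1.0) or not (reward 0.0).
    """
    rng = random.Random(seed)
    n = len(popularity)
    estimates = [0.0] * n  # running mean reward per content item
    pulls = [0] * n        # number of rounds each item has been cached

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: cache k items chosen uniformly at random.
            cache = rng.sample(range(n), k)
        else:
            # Exploit: cache the k items with the highest estimated reward.
            cache = sorted(range(n), key=lambda c: estimates[c], reverse=True)[:k]

        # One user request per round, drawn from the hidden popularity.
        requested = rng.choices(range(n), weights=popularity, k=1)[0]

        # Update the running mean of every cached item.
        for c in cache:
            reward = 1.0 if c == requested else 0.0
            pulls[c] += 1
            estimates[c] += (reward - estimates[c]) / pulls[c]

    # Final caching decision: the k items with the highest learned estimates.
    return sorted(range(n), key=lambda c: estimates[c], reverse=True)[:k]
```

Calling `top_k_bandit_cache(zipf_popularity(20), k=3)` returns the indices of three content items that this simplified learner would keep in the cache. In a decentralized setting, each UAV would run its own instance against its local request stream.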