We consider status updating under inexact knowledge of the battery level of an energy harvesting (EH) sensor that sends status updates about a random process to users via a cache-enabled edge node. More precisely, control decisions are made using only the battery level reported in the most recently received status update packet. Upon receiving on-demand requests for fresh information from the users, the edge node uses the available information to decide whether to command the sensor to send a status update or to serve the most recently received measurement from the cache. We seek the actions of the edge node that minimize the average age of information (AoI) of the served measurements, i.e., the average on-demand AoI. Accounting for the partial battery knowledge, we model the problem as a partially observable Markov decision process (POMDP) and, by characterizing its key structural properties, develop a dynamic programming algorithm to obtain an optimal policy. Simulation results illustrate the threshold-based structure of an optimal policy and show the gains of the proposed optimal POMDP-based policy over a request-aware greedy (myopic) policy.
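To make the notion of partial battery knowledge concrete, the following minimal sketch shows how an edge node might maintain a belief (a probability distribution) over the sensor's battery level between status updates. It is an illustration under assumptions not stated above: energy arrives as one unit per slot with probability `harvest_prob` (Bernoulli harvesting), the battery has a finite capacity, and each status update costs one energy unit. The function names are hypothetical, and this is not the dynamic programming algorithm developed in the paper.

```python
def belief_from_report(level, capacity):
    """Point-mass belief right after a packet reports the battery level.

    Assumption: the reported level is exact at the time of reception.
    """
    belief = [0.0] * (capacity + 1)
    belief[level] = 1.0
    return belief


def propagate_belief(belief, harvest_prob):
    """Advance the belief by one slot in which no update is received.

    Assumption: Bernoulli harvesting, one energy unit per slot with
    probability harvest_prob, saturating at the battery capacity.
    """
    capacity = len(belief) - 1
    next_belief = [0.0] * len(belief)
    for level, prob in enumerate(belief):
        # Energy arrives: move up one level unless the battery is full.
        next_belief[min(level + 1, capacity)] += prob * harvest_prob
        # No energy arrives: the level is unchanged.
        next_belief[level] += prob * (1.0 - harvest_prob)
    return next_belief


def prob_nonempty(belief):
    """Probability that a command could be served.

    Assumption: sending one status update costs one energy unit.
    """
    return 1.0 - belief[0]


# Usage: the last packet reported an empty battery of capacity 2;
# two slots pass without any update being received.
belief = belief_from_report(level=0, capacity=2)
for _ in range(2):
    belief = propagate_belief(belief, harvest_prob=0.5)
success_chance = prob_nonempty(belief)
```

A policy operating on such a belief could, for instance, compare the chance of a successful update with the age of the cached measurement. The structure of an optimal policy depends on the full POMDP formulation and is not captured by this sketch.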