Reinforcement Learning (RL) has increasingly become a preferred method over traditional rule-based systems in diverse human-in-the-loop (HITL) applications due to its adaptability to the dynamic nature of human interactions. However, integrating RL in such settings raises significant privacy concerns, as it might inadvertently expose sensitive user information. Addressing this, our paper focuses on developing PAPER-HILT, an innovative, adaptive RL strategy through exploiting an early-exit approach designed explicitly for privacy preservation in HITL environments. This approach dynamically adjusts the tradeoff between privacy protection and system utility, tailoring its operation to individual behavioral patterns and preferences. We mainly highlight the challenge of dealing with the variable and evolving nature of human behavior, which renders static privacy models ineffective. PAPER-HILT's effectiveness is evaluated through its application in two distinct contexts: Smart Home environments and Virtual Reality (VR) Smart Classrooms. The empirical results demonstrate PAPER-HILT's capability to provide a personalized equilibrium between user privacy and application utility, adapting effectively to individual user needs and preferences. On average for both experiments, utility (performance) drops by 24%, and privacy (state prediction) improves by 31%.