Recent research on the emerging technology of reconfigurable intelligent surfaces (RISs) has identified its high potential for localization and sensing. However, to accurately localize a user placed in the area of influence of an RIS, the RIS location needs to be known a priori and its phase profile must be optimized for localization. In this paper, we study the problem of jointly localizing a hybrid RIS (HRIS) and a user, considering that the former is equipped with a single receive radio-frequency (RF) chain, enabling simultaneous tunable reflection and sensing via power splitting. Focusing on the downlink of a multi-antenna base station, we present a multi-stage approach for estimating the HRIS position and orientation as well as the user position. Our simulation results, including comparisons with the Cram\'er-Rao lower bounds, demonstrate the efficiency of the proposed localization approach and show that there exists an optimal HRIS power splitting ratio for the considered multi-parameter estimation problem.