A precise and robust transformation between the robot coordinate system and the AR-device coordinate system is paramount in augmented-reality-based human-robot interaction (HRI) using head-mounted displays (HMDs), both for intuitive information display and for the tracking of human motions. Most current solutions in this area rely either on the tracking of visual markers, e.g. QR codes, or on manual referencing, both of which yield unsatisfactory results; moreover, the precision of the referencing is almost never measured. Meanwhile, a plethora of object detection and referencing methods exists in the wider robotics and machine vision communities. Here we address this issue by first presenting an overview of the referencing methods currently used between robots and HMDs, followed by a brief overview of object detection and referencing methods used in the field of robotics. Based on these methods, we propose three classes of referencing algorithms we intend to pursue: semi-automatic one-shot, automatic one-shot, and automatic continuous. We describe the general workflow of each class as well as our proposed algorithms within each of them. Finally, we present the first experimental results of a semi-automatic referencing algorithm, tested on an industrial KUKA KR-5 manipulator.
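The abstract does not specify how the robot-to-HMD transformation is computed. Purely as an illustration of what a one-shot referencing step typically estimates, the sketch below recovers a rigid transform from corresponding points using the standard Kabsch/SVD point-set registration; the function name, the input data, and the use of this particular method are assumptions for illustration, not the paper's algorithm.

```python
# Illustrative sketch only (not the paper's method): estimate the rigid
# transform (R, t) mapping robot-frame points to HMD-frame points from
# N >= 3 non-collinear correspondences via the Kabsch/SVD method.
import numpy as np

def estimate_rigid_transform(p_robot: np.ndarray, p_hmd: np.ndarray):
    """Return (R, t) such that p_hmd ~= R @ p_robot + t.

    p_robot, p_hmd: (N, 3) arrays of corresponding points (hypothetical
    input, e.g. robot flange positions marked by the HMD user).
    """
    c_r = p_robot.mean(axis=0)               # centroid, robot frame
    c_h = p_hmd.mean(axis=0)                 # centroid, HMD frame
    H = (p_robot - c_r).T @ (p_hmd - c_h)    # 3x3 cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))   # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = c_h - R @ c_r
    return R, t

# Hypothetical usage with synthetic data:
p_robot = np.array([[0.0, 0.0, 0.0], [0.5, 0.0, 0.0],
                    [0.0, 0.5, 0.0], [0.0, 0.0, 0.5]])
t_true = np.array([1.0, 2.0, 0.5])
p_hmd = p_robot + t_true                     # pure translation for the demo
R, t = estimate_rigid_transform(p_robot, p_hmd)
# Per-point residuals: one possible precision measure of the kind the
# abstract notes is rarely reported for HMD-robot referencing.
residual = np.linalg.norm(p_hmd - (p_robot @ R.T + t), axis=1)
```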