Compared to the rigid robots that are often studied in reinforcement learning, some sophisticated robots, such as soft or continuum robots, have more complicated physical characteristics. Moreover, recent reinforcement learning methods are data-inefficient and cannot be deployed directly on the robot without simulation. In this paper, we propose an efficient reinforcement learning method based on inexplicit prior knowledge to address these problems. The method is first validated in simulation and then employed directly in the real world. Using our method, we achieve visual active tracking and distance maintenance with a tendon-driven robot, capabilities that will be critical in minimally invasive procedures.