The integration of vision-based frameworks to achieve lunar robot applications faces numerous challenges such as terrain configuration or extreme lighting conditions. This paper presents a generic task pipeline using object detection, instance segmentation and grasp detection, that can be used for various applications by using the results of these vision-based systems in a different way. We achieve a rock stacking task on a non-flat surface in difficult lighting conditions with a very good success rate of 92%. Eventually, we present an experiment to assemble 3D printed robot components to initiate more complex tasks in the future.