With recent advances in autonomous driving technology, the growing pursuit of efficiency in repetitive tasks, and the rising value of contactless services, mobile service robots such as delivery robots and serving robots are attracting increasing attention, and demand for them grows day by day. However, when an error occurs, most commercial serving robots must return to their initial position and orientation before they can resume normal operation. In this paper, we address this problem by focusing on end-to-end relocalization of serving robots, that is, predicting the robot's pose directly from onboard sensor data alone using neural networks. In particular, we propose a deep neural network architecture for relocalization based on camera-2D LiDAR sensor fusion, which we call FusionLoc. In the proposed method, multi-head self-attention complements the different types of information captured by the two sensors. Our experiments on a dataset collected by a commercial serving robot demonstrate that FusionLoc outperforms previous relocalization methods that take only a single image or a 2D LiDAR point cloud as input, as well as a straightforward fusion baseline that simply concatenates their features.
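The attention-based fusion idea can be illustrated with a small toy sketch: each modality's encoded feature vector is treated as a token, multi-head self-attention mixes the two tokens, and a linear head regresses a planar pose. This is an illustrative NumPy example under assumed dimensions, not the paper's actual architecture; the encoders, layer sizes, weight initialization, and the pose head are all hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(tokens, num_heads, rng):
    """Vanilla multi-head self-attention over a small token sequence."""
    n, d = tokens.shape
    d_head = d // num_heads
    # Randomly initialized projections; a real model would learn these.
    Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4))

    def split(x):  # (n, d) -> (num_heads, n, d_head)
        return x.reshape(n, num_heads, d_head).transpose(1, 0, 2)

    q, k, v = split(tokens @ Wq), split(tokens @ Wk), split(tokens @ Wv)
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(d_head), axis=-1)
    out = (attn @ v).transpose(1, 0, 2).reshape(n, d)  # merge heads
    return out @ Wo

rng = np.random.default_rng(0)
d_model = 64  # assumed feature dimension

# Hypothetical per-modality features, e.g. from an image encoder and a
# 2D LiDAR point-cloud encoder (the encoders themselves are not shown).
image_feat = rng.standard_normal(d_model)
lidar_feat = rng.standard_normal(d_model)

# Treat each modality's feature as one token and let attention mix them,
# so each modality can complement the other's information.
tokens = np.stack([image_feat, lidar_feat])  # (2, d_model)
fused = multi_head_self_attention(tokens, num_heads=4, rng=rng).mean(axis=0)

# Toy pose-regression head: predicts a planar pose (x, y, yaw).
W_pose = rng.standard_normal((d_model, 3)) / np.sqrt(d_model)
pose = fused @ W_pose
```

In contrast, the concatenation baseline mentioned above would simply stack the two feature vectors into one and regress the pose from that, without any cross-modal interaction weights.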