It is well known that visual SLAM systems based on dense matching are locally accurate but are also susceptible to long-term drift and map corruption. In contrast, feature matching methods can achieve greater long-term consistency but can suffer from inaccurate local pose estimation when feature information is sparse. Based on these observations, we propose an RGB-D SLAM system that leverages the advantages of both approaches: using dense frame-to-model odometry to build accurate sub-maps and on-the-fly feature-based matching across sub-maps for global map optimisation. In addition, we incorporate a learning-based loop closure component based on 3-D features which further stabilises map building. We have evaluated the approach on indoor sequences from public datasets, and the results show that it performs on par or better than state-of-the-art systems in terms of map reconstruction quality and pose estimation. The approach can also scale to large scenes where other systems often fail.