In this paper, we propose a tightly-coupled, multi-modal simultaneous localization and mapping (SLAM) framework, integrating an extensive set of sensors: IMU, cameras, multiple lidars, and Ultra-wideband (UWB) range measurements, hence referred to as VIRAL (visual-inertial-ranging-lidar) SLAM. To achieve such a comprehensive sensor fusion system, one has to tackle several challenges such as data synchronization, multi-threading programming, bundle adjustment (BA), and conflicting coordinate frames between UWB and the onboard sensors, so as to ensure real-time localization and smooth updates in the state estimates. To this end, we propose a two stage approach. In the first stage, lidar, camera, and IMU data on a local sliding window are processed in a core odometry thread. From this local graph, new key frames are evaluated for admission to a global map. Visual feature-based loop closure is also performed to supplement the global factor graph with loop constraints. When the global factor graph satisfies a condition on spatial diversity, the BA process will be triggered, which updates the coordinate transform between UWB and onboard SLAM systems. The system then seamlessly transitions to the second stage where all sensors are tightly integrated in the odometry thread. The capability of our system is demonstrated via several experiments on high-fidelity graphical-physical simulation and public datasets.