Detecting people in images is a challenging problem. Differences in pose, clothing and lighting, along with other factors, cause a lot of variation in their appearance. To overcome these issues, we propose a system based on fused range and thermal infrared images. These measurements show considerably less variation and provide more meaningful information. We provide a brief introduction to the sensor technology used and propose a calibration method. Several data fusion algorithms are compared and their performance is assessed on a simulated data set. The results of initial experiments on real data are analyzed and the measurement errors and the challenges they present are discussed. The resulting fused data are used to efficiently detect people in a fixed camera set-up. The system is extended to include person tracking.