With the rapid commoditization of wearable sensors, detecting human movements from sensor datasets has become increasingly common over a wide range of applications. To detect activities, data scientists iteratively experiment with different classifiers before deciding which model to deploy. Effective reasoning about and comparison of alternative classifiers are crucial in successful model development. This is, however, inherently difficult in developing classifiers for sensor data, where the intricacy of long temporal sequences, high prediction frequency, and imprecise labeling make standard evaluation methods relatively ineffective and even misleading. We introduce Track Xplorer, an interactive visualization system to query, analyze, and compare the predictions of sensor-data classifiers. Track Xplorer enables users to interactively explore and compare the results of different classifiers, and assess their accuracy with respect to the ground-truth labels and video. Through integration with a version control system, Track Xplorer supports tracking of models and their parameters without additional workload on model developers. Track Xplorer also contributes an extensible algebra over track representations to filter, compose, and compare classification outputs, enabling users to reason effectively about classifier performance. We apply Track Xplorer in a collaborative project to develop classifiers to detect movements from multisensor data gathered from Parkinson's disease patients. We demonstrate how Track Xplorer helps identify early on possible systemic data errors, effectively track and compare the results of different classifiers, and reason about and pinpoint the causes of misclassifications.