Abstract: Many deep architectures and self-supervised pre-training techniques have been proposed for human activity recognition (HAR) from wearable multimodal sensors. Scaling laws can support more principled design by linking model capacity to pre-training data volume, yet they have not been established for HAR to the same extent as in language and vision. By conducting an exhaustive grid search over both the amount of pre-training data and Transformer architectures, we establish the first known scaling laws for HAR. We show that pre-training loss follows a power law in both the amount of data and the parameter count. We also show that increasing the number of users in a dataset yields a steeper improvement in performance than increasing the data per user, indicating that diversity of pre-training data is important; this contrasts with some previously reported findings in self-supervised HAR. These scaling laws translate into downstream performance improvements on three HAR benchmark datasets covering postures, modes of locomotion, and activities of daily living: UCI HAR, WISDM Phone, and WISDM Watch. Finally, we suggest that some previously published works should be revisited in light of these scaling laws, using more adequate model capacities.
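As an illustration of what a power-law relationship between pre-training loss and data volume looks like in practice, the following minimal sketch fits L(x) = a * x^(-b) by least squares in log-log space. The functional form (no irreducible-loss term), the function name `fit_power_law`, and the synthetic data points are assumptions made for illustration; they are not taken from the paper.

```python
import math


def fit_power_law(xs, ys):
    """Fit y = a * x**(-b) by ordinary least squares on (log x, log y).

    Assumed form for illustration only; the paper may use a different
    parameterisation (for example, one with an irreducible-loss offset).
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    sxy = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y))
    slope = sxy / sxx                    # slope of the line in log-log space
    intercept = mean_y - slope * mean_x  # log(a)
    return math.exp(intercept), -slope   # (a, b)


# Synthetic, noise-free example (hypothetical values, not results from the paper).
data_hours = [10, 100, 1000, 10000]
losses = [2.0 * h ** -0.25 for h in data_hours]
a, b = fit_power_law(data_hours, losses)
print(f"a = {a:.3f}, b = {b:.3f}")
```

A straight line in log-log space is the usual visual signature of such a law; the same fit can be repeated with parameter count on the x-axis.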
Abstract: Location fingerprinting locates devices by pattern-matching signal observations against a pre-defined signal map. This paper introduces a technique for fast signal map creation, given a dedicated surveyor with a smartphone and a floorplan. Our technique (PFSurvey) uses accelerometer, gyroscope, and magnetometer data to estimate the surveyor's trajectory post hoc, using Simultaneous Localisation and Mapping together with particle filtering to incorporate the building floorplan. We demonstrate that conventional methods can fail to recover the survey path robustly and to determine the room unambiguously. To counter this, we use a novel loop closure detection method based on magnetic field signals, and we propose incorporating the magnetic loop closures and straight-line constraints into the filtering process to ensure robust trajectory recovery. We show that this allows room ambiguities to be resolved. The proposed system can survey an entire building in minutes rather than days. We evaluate it in a large office space and compare it with state-of-the-art approaches, achieving trajectories within 1.1 m of the ground truth 90% of the time. The output signal maps closely approximate those built from a conventional, laborious manual survey. We also demonstrate that the signal maps built by PFSurvey provide similar or even better positioning performance than the manual signal maps.
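The general idea of using a floorplan inside a particle filter can be sketched as follows: each particle is moved by a noisy step, particles whose step would pass through a wall are discarded, and the survivors are resampled. This is a generic, heavily simplified illustration and not PFSurvey itself; it omits SLAM, magnetic loop closures, and straight-line constraints, and the names `crosses` and `pf_step`, the noise parameters, and the wall layout are all hypothetical.

```python
import math
import random


def _ccw(a, b, c):
    """True if points a, b, c are in counter-clockwise order."""
    return (c[1] - a[1]) * (b[0] - a[0]) > (b[1] - a[1]) * (c[0] - a[0])


def crosses(p, q, wall):
    """True if the step p -> q properly intersects the wall segment."""
    w1, w2 = wall
    return (_ccw(p, w1, w2) != _ccw(q, w1, w2)
            and _ccw(p, q, w1) != _ccw(p, q, w2))


def pf_step(particles, step_len, heading, walls, rng,
            len_sd=0.1, head_sd=0.1):
    """Move every particle by one noisy step, drop wall-crossers, resample.

    particles: list of (x, y) positions; walls: list of ((x1, y1), (x2, y2)).
    The Gaussian noise model and its parameters are illustrative assumptions.
    """
    survivors = []
    for x, y in particles:
        length = step_len + rng.gauss(0.0, len_sd)
        angle = heading + rng.gauss(0.0, head_sd)
        nx = x + length * math.cos(angle)
        ny = y + length * math.sin(angle)
        if not any(crosses((x, y), (nx, ny), w) for w in walls):
            survivors.append((nx, ny))
    if not survivors:
        # Every hypothesis was rejected; keep the old set unchanged here.
        # A real system would need a proper recovery strategy.
        return list(particles)
    # Uniform resampling back to the original particle count.
    return [rng.choice(survivors) for _ in particles]


# Two hypotheses on either side of a wall at x = 1 (hypothetical layout).
rng = random.Random(0)
walls = [((1.0, -10.0), (1.0, 10.0))]
particles = [(0.0, 0.0)] * 50 + [(2.0, 0.0)] * 50
particles = pf_step(particles, 1.5, 0.0, walls, rng, len_sd=0.01, head_sd=0.01)
```

In this toy layout, the hypotheses starting at x = 0 would have to walk through the wall and are eliminated, which is the same mechanism by which a floorplan can rule out candidate rooms.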