Abstract:Advanced real-time location systems (RTLS) allow for collecting spatio-temporal data from human movement behaviours. Tracking individuals in small areas such as schoolyards or nursing homes might impose difficulties for RTLS in terms of positioning accuracy. However, to date, few studies have investigated the performance of different localisation systems regarding the classification of human movement patterns in small areas. The current study aims to design and evaluate an automated framework to classify human movement trajectories obtained from two different RTLS: Global Navigation Satellite System (GNSS) and Ultra-wideband (UWB), in areas of approximately 100 square meters. Specifically, we designed a versatile framework which takes GNSS or UWB data as input, extracts features from these data and classifies them according to the annotated spatial patterns. The automated framework contains three choices for applying noise removal: (i) no noise removal, (ii) Savitzky Golay filter on the raw location data or (iii) Savitzky Golay filter on the extracted features, as well as three choices regarding the classification algorithm: Decision Tree (DT), Random Forest (RF) or Support Vector Machine (SVM). We integrated different stages within the framework with the Sequential Model-Based Algorithm Configuration (SMAC) to perform automated hyperparameter optimisation. The best performance is achieved with a pipeline consisting of noise removal applied to the raw location data with an RF model for the GNSS and no noise removal with an SVM model for the UWB. We further demonstrate through statistical analysis that the UWB achieves significantly higher results than the GNSS in classifying movement patterns.
Abstract:As the popularity of Location-based Social Networks (LBSNs) increases, designing accurate models for Point-of-Interest (POI) recommendation receives more attention. POI recommendation is often performed by incorporating contextual information into previously designed recommendation algorithms. Some of the major contextual information that has been considered in POI recommendation are the location attributes (i.e., exact coordinates of a location, category, and check-in time), the user attributes (i.e., comments, reviews, tips, and check-in made to the locations), and other information, such as the distance of the POI from user's main activity location, and the social tie between users. The right selection of such factors can significantly impact the performance of the POI recommendation. However, previous research does not consider the impact of the combination of these different factors. In this paper, we propose different contextual models and analyze the fusion of different major contextual information in POI recommendation. The major contributions of this paper are: (i) providing an extensive survey of context-aware location recommendation (ii) quantifying and analyzing the impact of different contextual information (e.g., social, temporal, spatial, and categorical) in the POI recommendation on available baselines and two new linear and non-linear models, that can incorporate all the major contextual information into a single recommendation model, and (iii) evaluating the considered models using two well-known real-world datasets. Our results indicate that while modeling geographical and temporal influences can improve recommendation quality, fusing all other contextual information into a recommendation model is not always the best strategy.
Abstract:Unsupervised discretization is a crucial step in many knowledge discovery tasks. The state-of-the-art method for one-dimensional data infers locally adaptive histograms using the minimum description length (MDL) principle, but the multi-dimensional case is far less studied: current methods consider the dimensions one at a time (if not independently), which result in discretizations based on rectangular cells of adaptive size. Unfortunately, this approach is unable to adequately characterize dependencies among dimensions and/or results in discretizations consisting of more cells (or bins) than is desirable. To address this problem, we propose an expressive model class that allows for far more flexible partitions of two-dimensional data. We extend the state of the art for the one-dimensional case to obtain a model selection problem based on the normalised maximum likelihood, a form of refined MDL. As the flexibility of our model class comes at the cost of a vast search space, we introduce a heuristic algorithm, named PALM, which partitions each dimension alternately and then merges neighbouring regions, all using the MDL principle. Experiments on synthetic data show that PALM 1) accurately reveals ground truth partitions that are within the model class (i.e., the search space), given a large enough sample size; 2) approximates well a wide range of partitions outside the model class; 3) converges, in contrast to its closest competitor IPD; and 4) is self-adaptive with regard to both sample size and local density structure of the data despite being parameter-free. Finally, we apply our algorithm to two geographic datasets to demonstrate its real-world potential.
Abstract:With the popularity of Location-based Social Networks, Point-of-Interest (POI) recommendation has become an important task, which learns the users' preferences and mobility patterns to recommend POIs. Previous studies show that incorporating contextual information such as geographical and temporal influences is necessary to improve POI recommendation by addressing the data sparsity problem. However, existing methods model the geographical influence based on the physical distance between POIs and users, while ignoring the temporal characteristics of such geographical influences. In this paper, we perform a study on the user mobility patterns where we find out that users' check-ins happen around several centers depending on their current temporal state. Next, we propose a spatio-temporal activity-centers algorithm to model users' behavior more accurately. Finally, we demonstrate the effectiveness of our proposed contextual model by incorporating it into the matrix factorization model under two different settings: i) static and ii) temporal. To show the effectiveness of our proposed method, which we refer to as STACP, we conduct experiments on two well-known real-world datasets acquired from Gowalla and Foursquare LBSNs. Experimental results show that the STACP model achieves a statistically significant performance improvement, compared to the state-of-the-art techniques. Also, we demonstrate the effectiveness of capturing geographical and temporal information for modeling users' activity centers and the importance of modeling them jointly.
Abstract:In this paper, we address the problem of how automated situation-awareness can be achieved by learning real-world situations from ubiquitously generated mobility data. Without semantic input about the time and space where situations take place, this turns out to be a fundamental challenging problem. Uncertainties also introduce technical challenges when data is generated in irregular time intervals, being mixed with noise, and errors. Purely relying on temporal patterns observable in mobility data, in this paper, we propose Spaceprint, a fully automated algorithm for finding the repetitive pattern of similar situations in spaces. We evaluate this technique by showing how the latent variables describing the category, and the actual identity of a space can be discovered from the extracted situation patterns. Doing so, we use different real-world mobility datasets with data about the presence of mobile entities in a variety of spaces. We also evaluate the performance of this technique by showing its robustness against uncertainties.