Abstract:Understanding human mobility is essential for many fields, including transportation planning. Currently, surveys are the primary source for such analysis. However, in the recent past, many researchers have focused on Call Detail Records (CDR) for identifying travel patterns. CDRs have shown correlation to human mobility behavior. However, one of the main issues in using CDR data is that it is difficult to identify the precise location of the user due to the low spacial resolution of the data and other artifacts such as the load sharing effect. Existing approaches have certain limitations. Previous studies using CDRs do not consider the transmit power of cell towers when localizing the users and use an oversimplified approach to identify load sharing effects. Furthermore, they consider the entire population of users as one group neglecting the differences in mobility patterns of different segments of users. This research introduces a novel methodology to user position localization from CDRs through improved detection of load sharing effects, by taking the transmit power into account, and segmenting the users into distinct groups for the purpose of learning any parameters of the model. Moreover, this research uses several methods to address the existing limitations and validate the generated results using nearly 4 billion CDR data points with travel survey data and voluntarily collected mobile data.
Abstract:Gliomas are lethal type of central nervous system tumors with a poor prognosis. Recently, with the advancements in the micro-array technologies thousands of gene expression related data of glioma patients are acquired, leading for salient analysis in many aspects. Thus, genomics are been emerged into the field of prognosis analysis. In this work, we identify survival related 7 gene signature and explore two approaches for survival prediction and risk estimation. For survival prediction, we propose a novel probabilistic programming based approach, which outperforms the existing traditional machine learning algorithms. An average 4 fold accuracy of 74% is obtained with the proposed algorithm. Further, we construct a prognostic risk model for risk estimation of glioma patients. This model reflects the survival of glioma patients, with high risk for low survival patients.
Abstract:Glioblastoma is the most malignant type of central nervous system tumor with GBM subtypes cleaved based on molecular level gene alterations. These alterations are also happened to affect the histology. Thus, it can cause visible changes in images, such as enhancement and edema development. In this study, we extract intensity, volume, and texture features from the tumor subregions to identify the correlations with gene expression features and overall survival. Consequently, we utilize the radiomics to find associations with the subtypes of glioblastoma. Accordingly, the fractal dimensions of the whole tumor, tumor core, and necrosis regions show a significant difference between the Proneural, Classical and Mesenchymal subtypes. Additionally, the subtypes of GBM are predicted with an average accuracy of 79% utilizing radiomics and accuracy over 90% utilizing gene expression profiles.
Abstract:Concept identification is a crucial step in understanding and building a knowledge base for any particular domain. However, it is not a simple task in very large domains such as restaurants and hotel. In this paper, a novel approach of identifying a concept hierarchy and classifying unseen words into identified concepts related to restaurant domain is presented. Sorting, identifying, classifying of domain-related words manually is tedious and therefore, the proposed process is automated to a great extent. Word embedding, hierarchical clustering, classification algorithms are effectively used to obtain concepts related to the restaurant domain. Further, this approach can also be extended to create a semi-automatic ontology on restaurant domain.