Abstract:When developing clinical prediction models, it can be challenging to balance between global models that are valid for all patients and personalized models tailored to individuals or potentially unknown subgroups. To aid such decisions, we propose a diagnostic tool for contrasting global regression models and patient-specific (local) regression models. The core utility of this tool is to identify where and for whom a global model may be inadequate. We focus on regression models and specifically suggest a localized regression approach that identifies regions in the predictor space where patients are not well represented by the global model. As localization becomes challenging when dealing with many predictors, we propose modeling in a dimension-reduced latent representation obtained from an autoencoder. Using such a neural network architecture for dimension reduction enables learning a latent representation simultaneously optimized for both good data reconstruction and for revealing local outcome-related associations suitable for robust localized regression. We illustrate the proposed approach with a clinical study involving patients with chronic obstructive pulmonary disease. Our findings indicate that the global model is adequate for most patients but that indeed specific subgroups benefit from personalized models. We also demonstrate how to map these subgroup models back to the original predictors, providing insight into why the global model falls short for these groups. Thus, the principal application and diagnostic yield of our tool is the identification and characterization of patients or subgroups whose outcome associations deviate from the global model.
Abstract:In a longitudinal clinical registry, different measurement instruments might have been used for assessing individuals at different time points. To combine them, we investigate deep learning techniques for obtaining a joint latent representation, to which the items of different measurement instruments are mapped. This corresponds to domain adaptation, an established concept in computer science for image data. Using the proposed approach as an example, we evaluate the potential of domain adaptation in a longitudinal cohort setting with a rather small number of time points, motivated by an application with different motor function measurement instruments in a registry of spinal muscular atrophy (SMA) patients. There, we model trajectories in the latent representation by ordinary differential equations (ODEs), where person-specific ODE parameters are inferred from baseline characteristics. The goodness of fit and complexity of the ODE solutions then allows to judge the measurement instrument mappings. We subsequently explore how alignment can be improved by incorporating corresponding penalty terms into model fitting. To systematically investigate the effect of differences between measurement instruments, we consider several scenarios based on modified SMA data, including scenarios where a mapping should be feasible in principle and scenarios where no perfect mapping is available. While misalignment increases in more complex scenarios, some structure is still recovered, even if the availability of measurement instruments depends on patient state. A reasonable mapping is feasible also in the more complex real SMA dataset. These results indicate that domain adaptation might be more generally useful in statistical modeling for longitudinal registry data.