Abstract:There is enormous enthusiasm for, and concern about, using large language models (LLMs) in healthcare, yet current assumptions are all based on general-purpose LLMs such as ChatGPT. This study develops a clinical generative LLM, GatorTronGPT, using 277 billion words of mixed clinical and English text and a GPT-3 architecture with 20 billion parameters. GatorTronGPT improves biomedical natural language processing for medical research. NLP models trained on synthetic text generated by GatorTronGPT outperform NLP models trained on real-world clinical text. A physicians' Turing test using a scale of 1 (worst) to 9 (best) shows that there is no significant difference in linguistic readability (p = 0.22; 6.57 for GatorTronGPT compared with 6.93 for human-written text) or clinical relevance (p = 0.91; 7.0 for GatorTronGPT compared with 6.97 for human-written text), and that physicians cannot differentiate them (p < 0.001). This study provides insights into the opportunities and challenges of LLMs for medical research and healthcare.
Abstract:Delirium is an acute decline or fluctuation in attention, awareness, or other cognitive function that can lead to serious adverse outcomes. Despite the severe outcomes, delirium is frequently unrecognized and uncoded in patients' electronic health records (EHRs) due to its transient and diverse nature. Natural language processing (NLP), a key technology that extracts medical concepts from clinical narratives, has shown great potential in studies of delirium outcomes and symptoms. To assist in the diagnosis and phenotyping of delirium, we formed an expert panel to categorize diverse delirium symptoms, composed annotation guidelines, created a delirium corpus with diverse delirium symptoms, and developed NLP methods to extract delirium symptoms from clinical notes. We compared 5 state-of-the-art transformer models including 2 models (BERT and RoBERTa) from the general domain and 3 models (BERT_MIMIC, RoBERTa_MIMIC, and GatorTron) from the clinical domain. GatorTron achieved the best strict and lenient F1 scores of 0.8055 and 0.8759, respectively. We conducted an error analysis to identify challenges in annotating delirium symptoms and developing NLP systems. To the best of our knowledge, this is the first large language model-based delirium symptom extraction system. Our study lays the foundation for the future development of computable phenotypes and diagnosis methods for delirium.
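The abstract above reports strict and lenient F1 scores without defining them. A minimal sketch of how the two are often computed in clinical concept extraction follows, under the assumption (not confirmed by the abstract) that "strict" requires exact span boundaries and label, while "lenient" accepts any overlapping span with the same label. The `Span` layout, the function names, and the example offsets are hypothetical.

```python
from typing import List, Tuple

# Hypothetical span layout: (start offset, end offset exclusive, label).
Span = Tuple[int, int, str]


def _overlaps(a: Span, b: Span) -> bool:
    """True if two spans share a label and at least one character."""
    return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]


def span_f1(gold: List[Span], pred: List[Span], lenient: bool = False) -> float:
    """F1 over spans; exact match when strict, overlap match when lenient."""
    if lenient:
        matched_pred = sum(any(_overlaps(p, g) for g in gold) for p in pred)
        matched_gold = sum(any(_overlaps(g, p) for p in pred) for g in gold)
    else:
        gold_set, pred_set = set(gold), set(pred)
        matched_pred = sum(p in gold_set for p in pred)
        matched_gold = sum(g in pred_set for g in gold)
    precision = matched_pred / len(pred) if pred else 0.0
    recall = matched_gold / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# Made-up example: the second prediction has the right label but a
# shifted start offset, and the third prediction is spurious.
gold = [(0, 9, "symptom"), (20, 30, "symptom")]
pred = [(0, 9, "symptom"), (22, 30, "symptom"), (40, 45, "symptom")]
strict_f1 = span_f1(gold, pred)
lenient_f1 = span_f1(gold, pred, lenient=True)
```

On this made-up example the strict score is about 0.4 and the lenient score about 0.8, which illustrates why a lenient score is never lower than the strict score on the same predictions.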
Abstract:The risk of diseases such as heart attack and high blood pressure could be reduced by adequate physical activity. However, even though the majority of the general population claims to perform some physical exercise, only a minority exercises enough to maintain a healthy lifestyle. Thus, physical inactivity has become one of the major public health concerns of the past decade. Research shows that the largest decrease in physical activity occurs during the transition from high school to college. It is therefore of great importance to quickly identify college students at health risk due to physical inactivity. Research also shows that an individual's level of physical activity is highly correlated with demographic features such as race and gender, as well as with self-motivation and support from family and friends. This information could be collected from each student via a 20-minute questionnaire, but the time needed to distribute and analyze each questionnaire makes this infeasible on a collegiate campus. Thus, we propose an automatic identifier of at-risk students, so that collegiate campuses and physical activity promotion departments can target these students more easily. In this paper, we present preliminary results of a supervised multilayer neural network trained with backpropagation for classifying students into at-risk and not-at-risk groups.
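The abstract above does not describe the network's architecture, features, or training settings. As a rough illustration only, a binary classifier of the general kind it names (a multilayer network trained by backpropagation) might be sketched as below. Everything here is an assumption: the single hidden layer, the sigmoid units, the cross-entropy loss, the learning rate, the class name `TinyMLP`, and the two made-up input features.

```python
import math
import random
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


class TinyMLP:
    """One hidden layer of sigmoid units, trained by backpropagation (SGD)."""

    def __init__(self, n_in: int, n_hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x: List[float]) -> Tuple[List[float], float]:
        """Return hidden activations and the output probability."""
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hj for w, hj in zip(self.w2, h)) + self.b2)
        return h, y

    def train_step(self, x: List[float], target: int, lr: float = 0.5) -> None:
        h, y = self.forward(x)
        # Gradient of the cross-entropy loss w.r.t. the output pre-activation.
        delta_out = y - target
        for j, hj in enumerate(h):
            # Propagate the error back through hidden unit j before
            # its outgoing weight is updated.
            delta_h = delta_out * self.w2[j] * hj * (1.0 - hj)
            self.w2[j] -= lr * delta_out * hj
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * delta_h * xi
            self.b1[j] -= lr * delta_h
        self.b2 -= lr * delta_out

    def predict(self, x: List[float]) -> int:
        """1 = at-risk, 0 = not at-risk (threshold 0.5)."""
        return int(self.forward(x)[1] >= 0.5)


def train(net: TinyMLP, data: List[Tuple[List[float], int]], epochs: int) -> None:
    for _ in range(epochs):
        for x, target in data:
            net.train_step(x, target)


# Made-up toy data, NOT from the study: two features scaled to [0, 1]
# (hypothetically, weekly activity and a self-motivation score).
toy_data = [([0.1, 0.2], 1), ([0.2, 0.1], 1), ([0.9, 0.8], 0), ([0.8, 0.9], 0)]
net = TinyMLP(n_in=2, n_hidden=3)
train(net, toy_data, epochs=2000)
```

In practice, questionnaire responses would need to be encoded numerically (for example, one-hot encoding for categorical demographic features) before being passed to such a network, and performance would have to be checked on held-out students rather than on the training data.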