Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Niall Taylor

Developing Healthcare Language Model Embedding Spaces

Mar 28, 2024

Niall Taylor, Dan Schofield, Andrey Kormilitzin, Dan W Joyce, Alejo Nevado-Holgado

Abstract:Pre-trained Large Language Models (LLMs) often struggle on out-of-domain datasets like healthcare focused text. We explore specialized pre-training to adapt smaller LLMs to different healthcare datasets. Three methods are assessed: traditional masked language modeling, Deep Contrastive Learning for Unsupervised Textual Representations (DeCLUTR), and a novel pre-training objective utilizing metadata categories from the healthcare settings. These schemes are evaluated on downstream document classification tasks for each dataset, with additional analysis of the resultant embedding spaces. Contrastively trained models outperform other approaches on the classification tasks, delivering strong performance from limited labeled data and with fewer model parameter updates required. While metadata-based pre-training does not further improve classifications across the datasets, it yields interesting embedding cluster separability. All domain adapted LLMs outperform their publicly available general base LLM, validating the importance of domain-specialization. This research illustrates efficient approaches to instill healthcare competency in compact LLMs even under tight computational budgets, an essential capability for responsible and sustainable deployment in local healthcare settings. We provide pre-training guidelines for specialized healthcare LLMs, motivate continued inquiry into contrastive objectives, and demonstrates adaptation techniques to align small LLMs with privacy-sensitive medical tasks.

Via

Access Paper or Ask Questions

Bespoke Large Language Models for Digital Triage Assistance in Mental Health Care

Mar 28, 2024

Niall Taylor, Andrey Kormilitzin, Isabelle Lorge, Alejo Nevado-Holgado, Dan W Joyce

Abstract:Contemporary large language models (LLMs) may have utility for processing unstructured, narrative free-text clinical data contained in electronic health records (EHRs) -- a particularly important use-case for mental health where a majority of routinely-collected patient data lacks structured, machine-readable content. A significant problem for the the United Kingdom's National Health Service (NHS) are the long waiting lists for specialist mental healthcare. According to NHS data, in each month of 2023, there were between 370,000 and 470,000 individual new referrals into secondary mental healthcare services. Referrals must be triaged by clinicians, using clinical information contained in the patient's EHR to arrive at a decision about the most appropriate mental healthcare team to assess and potentially treat these patients. The ability to efficiently recommend a relevant team by ingesting potentially voluminous clinical notes could help services both reduce referral waiting times and with the right technology, improve the evidence available to justify triage decisions. We present and evaluate three different approaches for LLM-based, end-to-end ingestion of variable-length clinical EHR data to assist clinicians when triaging referrals. Our model is able to deliver triage recommendations consistent with existing clinical practices and it's architecture was implemented on a single GPU, making it practical for implementation in resource-limited NHS environments where private implementations of LLM technology will be necessary to ensure confidential clinical data is appropriately controlled and governed.

Via

Access Paper or Ask Questions

Efficiency at Scale: Investigating the Performance of Diminutive Language Models in Clinical Tasks

Feb 16, 2024

Niall Taylor, Upamanyu Ghose, Omid Rohanian, Mohammadmahdi Nouriborji, Andrey Kormilitzin, David Clifton, Alejo Nevado-Holgado

Figure 1 for Efficiency at Scale: Investigating the Performance of Diminutive Language Models in Clinical Tasks

Figure 2 for Efficiency at Scale: Investigating the Performance of Diminutive Language Models in Clinical Tasks

Figure 3 for Efficiency at Scale: Investigating the Performance of Diminutive Language Models in Clinical Tasks

Figure 4 for Efficiency at Scale: Investigating the Performance of Diminutive Language Models in Clinical Tasks

Abstract:The entry of large language models (LLMs) into research and commercial spaces has led to a trend of ever-larger models, with initial promises of generalisability, followed by a widespread desire to downsize and create specialised models without the need for complete fine-tuning, using Parameter Efficient Fine-tuning (PEFT) methods. We present an investigation into the suitability of different PEFT methods to clinical decision-making tasks, across a range of model sizes, including extremely small models with as few as $25$ million parameters. Our analysis shows that the performance of most PEFT approaches varies significantly from one task to another, with the exception of LoRA, which maintains relatively high performance across all model sizes and tasks, typically approaching or matching full fine-tuned performance. The effectiveness of PEFT methods in the clinical domain is evident, particularly for specialised models which can operate on low-cost, in-house computing infrastructure. The advantages of these models, in terms of speed and reduced training costs, dramatically outweighs any performance gain from large foundation LLMs. Furthermore, we highlight how domain-specific pre-training interacts with PEFT methods and model size, and discuss how these factors interplay to provide the best efficiency-performance trade-off. Full code available at: tbd.

Via

Access Paper or Ask Questions

Detecting the Clinical Features of Difficult-to-Treat Depression using Synthetic Data from Large Language Models

Feb 12, 2024

Isabelle Lorge, Dan W. Joyce, Niall Taylor, Alejo Nevado-Holgado, Andrea Cipriani, Andrey Kormilitzin

Figure 1 for Detecting the Clinical Features of Difficult-to-Treat Depression using Synthetic Data from Large Language Models

Figure 2 for Detecting the Clinical Features of Difficult-to-Treat Depression using Synthetic Data from Large Language Models

Figure 3 for Detecting the Clinical Features of Difficult-to-Treat Depression using Synthetic Data from Large Language Models

Figure 4 for Detecting the Clinical Features of Difficult-to-Treat Depression using Synthetic Data from Large Language Models

Abstract:Difficult-to-treat depression (DTD) has been proposed as a broader and more clinically comprehensive perspective on a person's depressive disorder where despite treatment, they continue to experience significant burden. We sought to develop a Large Language Model (LLM)-based tool capable of interrogating routinely-collected, narrative (free-text) electronic health record (EHR) data to locate published prognostic factors that capture the clinical syndrome of DTD. In this work, we use LLM-generated synthetic data (GPT3.5) and a Non-Maximum Suppression (NMS) algorithm to train a BERT-based span extraction model. The resulting model is then able to extract and label spans related to a variety of relevant positive and negative factors in real clinical data (i.e. spans of text that increase or decrease the likelihood of a patient matching the DTD syndrome). We show it is possible to obtain good overall performance (0.70 F1 across polarity) on real clinical data on a set of as many as 20 different factors, and high performance (0.85 F1 with 0.95 precision) on a subset of important DTD factors such as history of abuse, family history of affective disorder, illness severity and suicidality by training the model exclusively on synthetic data. Our results show promise for future healthcare applications especially in applications where traditionally, highly confidential medical data and human-expert annotation would normally be required.

Via

Access Paper or Ask Questions

Personalised recommendations of sleep behaviour with neural networks using sleep diaries captured in Sleepio

Jul 29, 2022

Alejo Nevado-Holgado, Colin Espie, Maria Liakata, Alasdair Henry, Jenny Gu, Niall Taylor, Kate Saunders, Tom Walker, Chris Miller

Figure 1 for Personalised recommendations of sleep behaviour with neural networks using sleep diaries captured in Sleepio

Figure 2 for Personalised recommendations of sleep behaviour with neural networks using sleep diaries captured in Sleepio

Figure 3 for Personalised recommendations of sleep behaviour with neural networks using sleep diaries captured in Sleepio

Figure 4 for Personalised recommendations of sleep behaviour with neural networks using sleep diaries captured in Sleepio

Abstract:SleepioTM is a digital mobile phone and web platform that uses techniques from cognitive behavioural therapy (CBT) to improve sleep in people with sleep difficulty. As part of this process, Sleepio captures data about the sleep behaviour of the users that have consented to such data being processed. For neural networks, the scale of the data is an opportunity to train meaningful models translatable to actual clinical practice. In collaboration with Big Health, the therapeutics company that created and utilizes Sleepio, we have analysed data from a random sample of 401,174 sleep diaries and built a neural network to model sleep behaviour and sleep quality of each individual in a personalised manner. We demonstrate that this neural network is more accurate than standard statistical methods in predicting the sleep quality of an individual based on his/her behaviour from the last 10 days. We compare model performance in a wide range of hyperparameter settings representing various scenarios. We further show that the neural network can be used to produce personalised recommendations of what sleep habits users should follow to maximise sleep quality, and show that these recommendations are substantially better than the ones generated by standard methods. We finally show that the neural network can explain the recommendation given to each participant and calculate confidence intervals for each prediction, all of which are essential for clinicians to be able to adopt such a tool in clinical practice.

Via

Access Paper or Ask Questions

Clinical Prompt Learning with Frozen Language Models

May 11, 2022

Niall Taylor, Yi Zhang, Dan Joyce, Alejo Nevado-Holgado, Andrey Kormilitzin

Figure 1 for Clinical Prompt Learning with Frozen Language Models

Figure 2 for Clinical Prompt Learning with Frozen Language Models

Figure 3 for Clinical Prompt Learning with Frozen Language Models

Figure 4 for Clinical Prompt Learning with Frozen Language Models

Abstract:Prompt learning is a new paradigm in the Natural Language Processing (NLP) field which has shown impressive performance on a number of natural language tasks with common benchmarking text datasets in full, few-shot, and zero-shot train-evaluation setups. Recently, it has even been observed that large but frozen pre-trained language models (PLMs) with prompt learning outperform smaller but fine-tuned models. However, as with many recent NLP trends, the performance of even the largest PLMs such as GPT-3 do not perform well on specialized domains (e.g. medical text), and the common practice to achieve State of the Art (SoTA) results still consists of pre-training and fine-tuning the PLMs on downstream tasks. The reliance on fine-tuning large PLMs is problematic in clinical settings where data is often held in non-GPU environments, and more resource efficient methods of training specialized domain models is crucial. We investigated the viability of prompt learning on clinically meaningful decision tasks and directly compared with more traditional fine-tuning methods. Results are partially in line with the prompt learning literature, with prompt learning able to match or improve on traditional fine-tuning with substantially fewer trainable parameters and requiring less training data. We argue that prompt learning therefore provides lower computational resource costs applicable to clinical settings, that can serve as an alternative to fine-tuning ever increasing in size PLMs. Complementary code to reproduce experiments presented in this work can be found at: https://github.com/NtaylorOX/Public_Clinical_Prompt.

* 18 pages, 6 figures, 6 tables

Via

Access Paper or Ask Questions

Rationale production to support clinical decision-making

Nov 15, 2021

Niall Taylor, Lei Sha, Dan W Joyce, Thomas Lukasiewicz, Alejo Nevado-Holgado, Andrey Kormilitzin

Figure 1 for Rationale production to support clinical decision-making

Figure 2 for Rationale production to support clinical decision-making

Figure 3 for Rationale production to support clinical decision-making

Figure 4 for Rationale production to support clinical decision-making

Abstract:The development of neural networks for clinical artificial intelligence (AI) is reliant on interpretability, transparency, and performance. The need to delve into the black-box neural network and derive interpretable explanations of model output is paramount. A task of high clinical importance is predicting the likelihood of a patient being readmitted to hospital in the near future to enable efficient triage. With the increasing adoption of electronic health records (EHRs), there is great interest in applications of natural language processing (NLP) to clinical free-text contained within EHRs. In this work, we apply InfoCal, the current state-of-the-art model that produces extractive rationales for its predictions, to the task of predicting hospital readmission using hospital discharge notes. We compare extractive rationales produced by InfoCal to competitive transformer-based models pretrained on clinical text data and for which the attention mechanism can be used for interpretation. We find each presented model with selected interpretability or feature importance methods yield varying results, with clinical language domain expertise and pretraining critical to performance and subsequent interpretability.

* Machine Learning for Health (ML4H) - Extended Abstract

Via

Access Paper or Ask Questions

Learning to Detect Bipolar Disorder and Borderline Personality Disorder with Language and Speech in Non-Clinical Interviews

Aug 08, 2020

Bo Wang, Yue Wu, Niall Taylor, Terry Lyons, Maria Liakata, Alejo J Nevado-Holgado, Kate E A Saunders

Figure 1 for Learning to Detect Bipolar Disorder and Borderline Personality Disorder with Language and Speech in Non-Clinical Interviews

Figure 2 for Learning to Detect Bipolar Disorder and Borderline Personality Disorder with Language and Speech in Non-Clinical Interviews

Figure 3 for Learning to Detect Bipolar Disorder and Borderline Personality Disorder with Language and Speech in Non-Clinical Interviews

Figure 4 for Learning to Detect Bipolar Disorder and Borderline Personality Disorder with Language and Speech in Non-Clinical Interviews

Abstract:Bipolar disorder (BD) and borderline personality disorder (BPD) are both chronic psychiatric disorders. However, their overlapping symptoms and common comorbidity make it challenging for the clinicians to distinguish the two conditions on the basis of a clinical interview. In this work, we first present a new multi-modal dataset containing interviews involving individuals with BD or BPD being interviewed about a non-clinical topic . We investigate the automatic detection of the two conditions, and demonstrate a good linear classifier that can be learnt using a down-selected set of features from the different aspects of the interviews and a novel approach of summarising these features. Finally, we find that different sets of features characterise BD and BPD, thus providing insights into the difference between the automatic screening of the two conditions.

Via

Access Paper or Ask Questions