Caylin Hickey

Institutional Platform for Secure Self-Service Large Language Model Exploration

Feb 01, 2024

V. K. Cody Bumgardner, Mitchell A. Klusty, W. Vaiden Logan, Samuel E. Armstrong, Caylin Hickey, Jeff Talbert

Abstract: This paper introduces a user-friendly platform developed by the University of Kentucky Center for Applied AI, designed to make large, customized language models (LLMs) more accessible. By capitalizing on recent advancements in multi-LoRA inference, the system efficiently accommodates custom adapters for a diverse range of users and projects. The paper outlines the system's architecture and key features, encompassing dataset curation, model training, secure inference, and text-based feature extraction. We illustrate the establishment of a tenant-aware computational network using agent-based methods, securely utilizing islands of isolated resources as a unified system. The platform strives to deliver secure LLM services, emphasizing process and data isolation, end-to-end encryption, and role-based resource authentication. This contribution aligns with the overarching goal of enabling simplified access to cutting-edge AI models and technology in support of scientific discovery.

* 10 pages, 11 figures, 5 listings, 4 tables

Local Large Language Models for Complex Structured Medical Tasks

Aug 03, 2023

V. K. Cody Bumgardner, Aaron Mullen, Sam Armstrong, Caylin Hickey, Jeff Talbert

Abstract: This paper introduces an approach that combines the language reasoning capabilities of large language models (LLMs) with the benefits of local training to tackle complex, domain-specific tasks. Specifically, the authors demonstrate their approach by extracting structured condition codes from pathology reports. The proposed approach utilizes local LLMs, which can be fine-tuned to respond to specific generative instructions and provide structured outputs. The authors collected a dataset of over 150k uncurated surgical pathology reports, containing gross descriptions, final diagnoses, and condition codes. They trained different model architectures, including LLaMA, BERT, and LongFormer, and evaluated their performance. The results show that the LLaMA-based models significantly outperform BERT-style models across all evaluated metrics, even with extremely reduced precision. The LLaMA models performed especially well with large datasets, demonstrating their ability to handle complex, multi-label tasks. Overall, this work presents an effective approach for utilizing LLMs to perform domain-specific tasks using accessible hardware, with potential applications in the medical domain, where complex data extraction and classification are required.

* 12 pages, Preprint of an article submitted for consideration in Pacific Symposium on Biocomputing © 2024 World Scientific Publishing Company https://www.worldscientific.com/

Parallelized Interactive Machine Learning on Autonomous Vehicles

Dec 23, 2018

Xi Chen, Caylin Hickey

Abstract: Deep reinforcement learning (deep RL) has achieved superior performance in complex sequential tasks by learning directly from image input. A deep neural network is used as a function approximator and requires no specific state information. However, one drawback of using only images as input is that this approach requires a prohibitively large amount of training time and data for the model to learn the state feature representation and approach reasonable performance. This is not feasible in real-world applications, especially when the data are expensive and the training phase could introduce disasters that affect human safety. In this work, we use a human demonstration approach to speed up training for learning features and use the resulting pre-trained model to replace the neural network in the deep RL Deep Q-Network (DQN), followed by human interaction to further refine the model. We empirically evaluate our approach in the Microsoft AirSim car simulator, using both a human demonstration model alone and a modified DQN that includes the human demonstration model. Our results show that (1) pre-training with human demonstration in a supervised learning approach is better and much faster at discovering features than DQN alone, (2) initializing the DQN with a pre-trained model provides a significant improvement in training time and performance even with limited human demonstration, and (3) providing the ability for humans to supply suggestions during DQN training can speed up the network's convergence on an optimal policy, as well as allow it to learn more complex policies that are harder to discover by random exploration.

* 6 pages, NAECON 2018 - IEEE National Aerospace and Electronics Conference
