David Beymer

Self-Regulated Data-Free Knowledge Amalgamation for Text Classification

Jun 16, 2024

Prashanth Vijayaraghavan, Hongzhi Wang, Luyao Shi, Tyler Baldwin, David Beymer, Ehsan Degan

Abstract: Recently, there has been a growing availability of pre-trained text models on various model repositories. These models greatly reduce the cost of training new models from scratch as they can be fine-tuned for specific tasks or trained on large datasets. However, these datasets may not be publicly accessible due to privacy, security, or intellectual property issues. In this paper, we aim to develop a lightweight student network that can learn from multiple teacher models without accessing their original training data. Hence, we investigate Data-Free Knowledge Amalgamation (DFKA), a knowledge-transfer task that combines insights from multiple pre-trained teacher models and transfers them effectively to a compact student network. To accomplish this, we propose STRATANET, a modeling framework comprising: (a) a steerable data generator that produces text data tailored to each teacher and (b) an amalgamation module that implements a self-regulative strategy using confidence estimates from the teachers' different layers to selectively integrate their knowledge and train a versatile student. We evaluate our method on three benchmark text classification datasets with varying labels or domains. Empirically, we demonstrate that the student model learned using STRATANET significantly outperforms several baselines under data-driven and data-free constraints.

* 12 pages, 5 Figures, Proceedings of NAACL 2024
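
As a rough illustration of the idea of weighting teachers by their confidence, the sketch below blends the output distributions of several teachers into one soft target for a student. It is not the STRATANET procedure: the abstract does not specify how confidence is estimated or combined, so using each teacher's maximum softmax probability as its confidence, assuming all teachers share one label space, and the function names `softmax` and `amalgamate` are all assumptions made here for illustration.

```python
import math


def softmax(logits):
    """Convert raw logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def amalgamate(teacher_logits):
    """Blend teacher distributions into one soft target.

    Assumption: confidence is the teacher's maximum class probability,
    and all teachers predict over the same label space.
    """
    probs = [softmax(logits) for logits in teacher_logits]
    confidences = [max(p) for p in probs]
    norm = sum(confidences)
    weights = [c / norm for c in confidences]
    num_classes = len(probs[0])
    return [
        sum(w * p[i] for w, p in zip(weights, probs))
        for i in range(num_classes)
    ]


# A confident teacher and an uncertain one: the blend leans toward the first.
target = amalgamate([[4.0, 0.0, 0.0], [0.0, 0.2, 0.1]])
print(target)
```

A student could then be trained against `target` with a standard distillation loss; that step is omitted here.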

VHDL-Eval: A Framework for Evaluating Large Language Models in VHDL Code Generation

Jun 06, 2024

Prashanth Vijayaraghavan, Luyao Shi, Stefano Ambrogio, Charles Mackin, Apoorva Nitsure, David Beymer, Ehsan Degan

Abstract: With the unprecedented advancements in Large Language Models (LLMs), their application domains have expanded to include code generation tasks across various programming languages. While significant progress has been made in enhancing LLMs for popular programming languages, there exists a notable gap in comprehensive evaluation frameworks tailored for Hardware Description Languages (HDLs), particularly VHDL. This paper addresses this gap by introducing a comprehensive evaluation framework designed specifically for assessing LLM performance on the VHDL code generation task. We construct an evaluation dataset by translating a collection of Verilog evaluation problems to VHDL and aggregating publicly available VHDL problems, resulting in a total of 202 problems. To assess the functional correctness of the generated VHDL code, we utilize a curated set of self-verifying testbenches designed specifically for the aggregated VHDL problem set. We conduct an initial evaluation of different LLMs and their variants, including zero-shot code generation, in-context learning (ICL), and parameter-efficient fine-tuning (PEFT) methods. Our findings underscore the considerable challenges faced by existing LLMs in VHDL code generation, revealing significant scope for improvement. This study emphasizes the necessity of supervised fine-tuning of code generation models specifically for VHDL, offering potential benefits to VHDL designers seeking efficient code generation solutions.

* 6 pages, 3 Figures, LAD'24

Creation and Validation of a Chest X-Ray Dataset with Eye-tracking and Report Dictation for AI Development

Oct 08, 2020

Alexandros Karargyris, Satyananda Kashyap, Ismini Lourentzou, Joy Wu, Arjun Sharma, Matthew Tong, Shafiq Abedin, David Beymer, Vandana Mukherjee, Elizabeth A Krupinski(+1 more)

Abstract: We developed a rich dataset of Chest X-Ray (CXR) images to assist investigators in artificial intelligence. The data were collected using an eye tracking system while a radiologist reviewed and reported on 1,083 CXR images. The dataset contains the following aligned data: CXR image, transcribed radiology report text, radiologist's dictation audio, and eye gaze coordinates. We hope this dataset can contribute to various areas of research, particularly explainable and multimodal deep learning / machine learning methods. Furthermore, investigators in disease classification and localization, automated radiology report generation, and human-machine interaction can benefit from these data. We report deep learning experiments that use attention maps produced from the eye gaze data to show the potential utility of this dataset.

Automatic Diagnosis of Pulmonary Embolism Using an Attention-guided Framework: A Large-scale Study

May 29, 2020

Luyao Shi, Deepta Rajan, Shafiq Abedin, Manikanta Srikar Yellapragada, David Beymer, Ehsan Dehghan

Abstract: Pulmonary Embolism (PE) is a life-threatening disorder associated with high mortality and morbidity. Prompt diagnosis and immediate initiation of therapeutic action are important. We explored a deep learning model to detect PE on volumetric contrast-enhanced chest CT scans using a 2-stage training strategy. First, a residual convolutional neural network (ResNet) was trained using annotated 2D images. In addition to the classification loss, an attention loss was added during training to help the network focus attention on PE. Next, a recurrent network was used to scan sequentially through the features provided by the pre-trained ResNet to detect PE. This combination allows the network to be trained using both a limited and sparse set of pixel-level annotated images and a large number of easily obtainable patient-level image-label pairs. We used 1,670 sparsely annotated studies and more than 10,000 labeled studies in our training. On a test set with 2,160 patient studies, the proposed method achieved an area under the ROC curve (AUC) of 0.812. The proposed framework is also able to provide localized attention maps that indicate possible PE lesions, which could potentially help radiologists accelerate the diagnostic process.

* MIDL 2020 Full Paper
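
For readers unfamiliar with the AUC metric reported above, it equals the probability that a randomly chosen positive study receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it by direct pairwise comparison; the function name `roc_auc` and the toy labels and scores are illustrative assumptions and have no connection to the paper's data or evaluation code.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0 (negative) or 1 (positive)
    scores: model scores, higher meaning "more likely positive"
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise loop is quadratic in the number of examples; for large test sets a rank-based formulation gives the same value more efficiently.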

Pi-PE: A Pipeline for Pulmonary Embolism Detection using Sparsely Annotated 3D CT Images

Oct 21, 2019

Deepta Rajan, David Beymer, Shafiqul Abedin, Ehsan Dehghan

Abstract: Pulmonary embolisms (PE) are known to be one of the leading causes of cardiac-related mortality. Due to inherent variabilities in how PE manifests and the cumbersome nature of manual diagnosis, there is growing interest in leveraging AI tools for detecting PE. In this paper, we build a two-stage detection pipeline that is accurate, computationally efficient, robust to variations in PE types and kernels used for CT reconstruction, and most importantly, does not require dense annotations. Given the challenges in acquiring expert annotations in large-scale datasets, our approach produces state-of-the-art results with very sparse emboli contours (at 10mm slice spacing), while using models with significantly fewer parameters. We achieve AUC scores of 0.94 on the validation set and 0.85 on the test set of highly severe PEs. Using a large, real-world dataset characterized by complex PE types and patients from multiple hospitals, we present an elaborate empirical study and provide guidelines for designing highly generalizable pipelines.

* 2019 NeurIPS ML4H (Proceedings of Machine Learning Research)

Generalization Studies of Neural Network Models for Cardiac Disease Detection Using Limited Channel ECG

Jan 05, 2019

Deepta Rajan, David Beymer, Girish Narayan

Abstract: Acceleration of machine learning research in healthcare is challenged by the lack of large, annotated, and balanced datasets. Furthermore, dealing with measurement inaccuracies and exploiting unsupervised data are considered to be central to improving existing solutions. In particular, a primary objective in predictive modeling is to generalize well both to unseen variations within the observed classes and to unseen classes. In this work, we consider such a challenging problem in machine learning driven diagnosis: detecting a gamut of cardiovascular conditions (e.g., infarction, dysrhythmia) from limited channel ECG measurements. Though deep neural networks have achieved unprecedented success in predictive modeling, they rely solely on discriminative models that can generalize poorly to unseen classes. We argue that unsupervised learning can be utilized to construct effective latent spaces that facilitate better generalization. This work extensively compares the generalization of our proposed approach against a state-of-the-art deep learning solution. Our results show significant improvements in F1-scores.

* IEEE Computing in Cardiology (CinC) 2018

Disease Detection in Weakly Annotated Volumetric Medical Images using a Convolutional LSTM Network

Dec 03, 2018

Nathaniel Braman, David Beymer, Ehsan Dehghan

Abstract: We explore a solution for learning disease signatures from weakly, yet easily obtainable, annotated volumetric medical imaging data by analyzing 3D volumes as a sequence of 2D images. We demonstrate the performance of our solution in the detection of emphysema in lung cancer screening low-dose CT images. Our approach utilizes convolutional long short-term memory (LSTM) to "scan" sequentially through an imaging volume for the presence of disease in a portion of the scanned region. This framework allowed effective learning given only volumetric images and binary disease labels, thus enabling training from a large dataset of 6,631 un-annotated image volumes from 4,486 patients. When evaluated on a test set of 2,163 volumes from 2,163 patients, our model distinguished emphysema with an area under the receiver operating characteristic curve (AUC) of 0.83. This approach was found to outperform 2D convolutional neural networks (CNN) implemented with various multiple-instance learning schemes (AUC=0.69-0.76) and a 3D CNN (AUC=0.77).

* Machine Learning for Health (ML4H) Workshop at NeurIPS 2018, arXiv:1811.07216; Medical Imaging Meets NeurIPS Workshop at NeurIPS 2018
