Get our free extension to see links to code for papers anywhere online!Free add-on: code for papers everywhere!Free add-on: See code for papers anywhere!

Add to Chrome

Add to Firefox

Add to Edge

Halil Kilicoglu

Use as Directed? A Comparison of Software Tools Intended to Check Rigor and Transparency of Published Work

Jul 23, 2025

Peter Eckmann, Adrian Barnett, Alexandra Bannach-Brown, Elisa Pilar Bascunan Atria, Guillaume Cabanac, Louise Delwen Owen Franzen, Małgorzata Anna Gazda, Kaitlyn Hair, James Howison, Halil Kilicoglu(+10 more)

Abstract:The causes of the reproducibility crisis include lack of standardization and transparency in scientific reporting. Checklists such as ARRIVE and CONSORT seek to improve transparency, but they are not always followed by authors and peer review often fails to identify missing items. To address these issues, there are several automated tools that have been designed to check different rigor criteria. We have conducted a broad comparison of 11 automated tools across 9 different rigor criteria from the ScreenIT group. We found some criteria, including detecting open data, where the combination of tools showed a clear winner, a tool which performed much better than other tools. In other cases, including detection of inclusion and exclusion criteria, the combination of tools exceeded the performance of any one tool. We also identified key areas where tool developers should focus their effort to make their tool maximally useful. We conclude with a set of insights and recommendations for stakeholders in the development of rigor and transparency detection tools. The code and data for the study is available at https://github.com/PeterEckmann1/tool-comparison.

Via

Access Paper or Ask Questions

Multi-label Sequential Sentence Classification via Large Language Model

Nov 23, 2024

Mengfei Lan, Lecheng Zheng, Shufan Ming, Halil Kilicoglu

Figure 1 for Multi-label Sequential Sentence Classification via Large Language Model

Figure 2 for Multi-label Sequential Sentence Classification via Large Language Model

Figure 3 for Multi-label Sequential Sentence Classification via Large Language Model

Figure 4 for Multi-label Sequential Sentence Classification via Large Language Model

Abstract:Sequential sentence classification (SSC) in scientific publications is crucial for supporting downstream tasks such as fine-grained information retrieval and extractive summarization. However, current SSC methods are constrained by model size, sequence length, and single-label setting. To address these limitations, this paper proposes LLM-SSC, a large language model (LLM)-based framework for both single- and multi-label SSC tasks. Unlike previous approaches that employ small- or medium-sized language models, the proposed framework utilizes LLMs to generate SSC labels through designed prompts, which enhance task understanding by incorporating demonstrations and a query to describe the prediction target. We also present a multi-label contrastive learning loss with auto-weighting scheme, enabling the multi-label classification task. To support our multi-label SSC analysis, we introduce and release a new dataset, biorc800, which mainly contains unstructured abstracts in the biomedical domain with manual annotations. Experiments demonstrate LLM-SSC's strong performance in SSC under both in-context learning and task-specific tuning settings. We release biorc800 and our code at: https://github.com/ScienceNLP-Lab/LLM-SSC.

* Accepted by EMNLP 2024

Via

Access Paper or Ask Questions

DiMB-RE: Mining the Scientific Literature for Diet-Microbiome Associations

Sep 29, 2024

Gibong Hong, Veronica Hindle, Nadine M. Veasley, Hannah D. Holscher, Halil Kilicoglu

Figure 1 for DiMB-RE: Mining the Scientific Literature for Diet-Microbiome Associations

Figure 2 for DiMB-RE: Mining the Scientific Literature for Diet-Microbiome Associations

Figure 3 for DiMB-RE: Mining the Scientific Literature for Diet-Microbiome Associations

Figure 4 for DiMB-RE: Mining the Scientific Literature for Diet-Microbiome Associations

Abstract:Motivation: The gut microbiota has recently emerged as a key factor that underpins certain connections between diet and human health. A tremendous amount of knowledge has been amassed from experimental studies on diet, human metabolism and microbiome. However, this evidence remains mostly buried in scientific publications, and biomedical literature mining in this domain remains scarce. We developed DiMB-RE, a comprehensive corpus annotated with 15 entity types (e.g., Nutrient, Microorganism) and 13 relation types (e.g., increases, improves) capturing diet-microbiome associations. We also trained and evaluated state-of-the-art natural language processing (NLP) models for named entity, trigger, and relation extraction as well as factuality detection using DiMB-RE. Results: DiMB-RE consists of 14,450 entities and 4,206 relationships from 165 articles. While NLP models performed reasonably well for named entity recognition (0.760 F$_{1}$), end-to-end relation extraction performance was modest (0.356 F$_{1}$), partly due to missed entities and triggers as well as cross-sentence relations. Conclusions: To our knowledge, DiMB-RE is largest and most diverse dataset focusing on diet-microbiome interactions. It can serve as a benchmark corpus for biomedical literature mining. Availability: DiMB-RE and the NLP models are available at https://github.com/ScienceNLP-Lab/DiMB-RE.

* 13 pages, 2 figures. Please refer to the supplementary material if needed

Via

Access Paper or Ask Questions

BiomedRAG: A Retrieval Augmented Large Language Model for Biomedicine

May 02, 2024

Mingchen Li, Halil Kilicoglu, Hua Xu, Rui Zhang

Figure 1 for BiomedRAG: A Retrieval Augmented Large Language Model for Biomedicine

Figure 2 for BiomedRAG: A Retrieval Augmented Large Language Model for Biomedicine

Figure 3 for BiomedRAG: A Retrieval Augmented Large Language Model for Biomedicine

Figure 4 for BiomedRAG: A Retrieval Augmented Large Language Model for Biomedicine

Abstract:Large Language Models (LLMs) have swiftly emerged as vital resources for different applications in the biomedical and healthcare domains; however, these models encounter issues such as generating inaccurate information or hallucinations. Retrieval-augmented generation provided a solution for these models to update knowledge and enhance their performance. In contrast to previous retrieval-augmented LMs, which utilize specialized cross-attention mechanisms to help LLM encode retrieved text, BiomedRAG adopts a simpler approach by directly inputting the retrieved chunk-based documents into the LLM. This straightforward design is easily applicable to existing retrieval and language models, effectively bypassing noise information in retrieved documents, particularly in noise-intensive tasks. Moreover, we demonstrate the potential for utilizing the LLM to supervise the retrieval model in the biomedical domain, enabling it to retrieve the document that assists the LM in improving its predictions. Our experiments reveal that with the tuned scorer,\textsc{ BiomedRAG} attains superior performance across 5 biomedical NLP tasks, encompassing information extraction (triple extraction, relation extraction), text classification, link prediction, and question-answering, leveraging over 9 datasets. For instance, in the triple extraction task, \textsc{BiomedRAG} outperforms other triple extraction systems with micro-F1 scores of 81.42 and 88.83 on GIT and ChemProt corpora, respectively.

Via

Access Paper or Ask Questions

Examining the Causal Effect of First Names on Language Models: The Case of Social Commonsense Reasoning

Jun 01, 2023

Sullam Jeoung, Jana Diesner, Halil Kilicoglu

Figure 1 for Examining the Causal Effect of First Names on Language Models: The Case of Social Commonsense Reasoning

Figure 2 for Examining the Causal Effect of First Names on Language Models: The Case of Social Commonsense Reasoning

Figure 3 for Examining the Causal Effect of First Names on Language Models: The Case of Social Commonsense Reasoning

Figure 4 for Examining the Causal Effect of First Names on Language Models: The Case of Social Commonsense Reasoning

Abstract:As language models continue to be integrated into applications of personal and societal relevance, ensuring these models' trustworthiness is crucial, particularly with respect to producing consistent outputs regardless of sensitive attributes. Given that first names may serve as proxies for (intersectional) socio-demographic representations, it is imperative to examine the impact of first names on commonsense reasoning capabilities. In this paper, we study whether a model's reasoning given a specific input differs based on the first names provided. Our underlying assumption is that the reasoning about Alice should not differ from the reasoning about James. We propose and implement a controlled experimental framework to measure the causal effect of first names on commonsense reasoning, enabling us to distinguish between model predictions due to chance and caused by actual factors of interest. Our results indicate that the frequency of first names has a direct effect on model prediction, with less frequent names yielding divergent predictions compared to more frequent names. To gain insights into the internal mechanisms of models that are contributing to these behaviors, we also conduct an in-depth explainable analysis. Overall, our findings suggest that to ensure model robustness, it is essential to augment datasets with more diverse first names during the configuration stage.

Via

Access Paper or Ask Questions

Complementary and Integrative Health Lexicon (CIHLex) and Entity Recognition in the Literature

May 27, 2023

Huixue Zhou, Robin Austin, Sheng-Chieh Lu, Greg Silverman, Yuqi Zhou, Halil Kilicoglu, Hua Xu, Rui Zhang

Abstract:Objective: Our study aimed to construct an exhaustive Complementary and Integrative Health (CIH) Lexicon (CIHLex) to better represent the often underrepresented physical and psychological CIH approaches in standard terminologies. We also intended to apply advanced Natural Language Processing (NLP) models such as Bidirectional Encoder Representations from Transformers (BERT) and GPT-3.5 Turbo for CIH named entity recognition, evaluating their performance against established models like MetaMap and CLAMP. Materials and Methods: We constructed the CIHLex by integrating various resources, compiling and integrating data from biomedical literature and relevant knowledge bases. The Lexicon encompasses 198 unique concepts with 1090 corresponding unique terms. We matched these concepts to the Unified Medical Language System (UMLS). Additionally, we developed and utilized BERT models and compared their efficiency in CIH named entity recognition to that of other models such as MetaMap, CLAMP, and GPT3.5-turbo. Results: From the 198 unique concepts in CIHLex, 62.1% could be matched to at least one term in the UMLS. Moreover, 75.7% of the mapped UMLS Concept Unique Identifiers (CUIs) were categorized as "Therapeutic or Preventive Procedure." Among the models applied to CIH named entity recognition, BLUEBERT delivered the highest macro average F1-score of 0.90, surpassing other models. Conclusion: Our CIHLex significantly augments representation of CIH approaches in biomedical literature. Demonstrating the utility of advanced NLP models, BERT notably excelled in CIH entity recognition. These results highlight promising strategies for enhancing standardization and recognition of CIH terminology in biomedical contexts.

Via

Access Paper or Ask Questions

Commonsense-Aware Prompting for Controllable Empathetic Dialogue Generation

Feb 02, 2023

Yiren Liu, Halil Kilicoglu

Abstract:Improving the emotional awareness of pre-trained language models is an emerging important problem for dialogue generation tasks. Although prior studies have introduced methods to improve empathetic dialogue generation, few have discussed how to incorporate commonsense knowledge into pre-trained language models for controllable dialogue generation. In this study, we propose a novel framework that improves empathetic dialogue generation using pre-trained language models by 1) incorporating commonsense knowledge through prompt verbalization, and 2) controlling dialogue generation using a strategy-driven future discriminator. We conducted experiments to reveal that both the incorporation of social commonsense knowledge and enforcement of control over generation help to improve generation performance. Finally, we discuss the implications of our study for future research.

* Accepted to Workshop on Knowledge Augmented Methods for Natural Language Processing, in conjunction with AAAI 2023

Via

Access Paper or Ask Questions

Developing a Knowledge Graph Framework for Pharmacokinetic Natural Product-Drug Interactions

Sep 24, 2022

Sanya B. Taneja, Tiffany J. Callahan, Mary F. Paine, Sandra L. Kane-Gill, Halil Kilicoglu, Marcin P. Joachimiak, Richard D. Boyce

Figure 1 for Developing a Knowledge Graph Framework for Pharmacokinetic Natural Product-Drug Interactions

Figure 2 for Developing a Knowledge Graph Framework for Pharmacokinetic Natural Product-Drug Interactions

Figure 3 for Developing a Knowledge Graph Framework for Pharmacokinetic Natural Product-Drug Interactions

Figure 4 for Developing a Knowledge Graph Framework for Pharmacokinetic Natural Product-Drug Interactions

Abstract:Pharmacokinetic natural product-drug interactions (NPDIs) occur when botanical natural products are co-consumed with pharmaceutical drugs. Understanding mechanisms of NPDIs is key to preventing adverse events. We constructed a knowledge graph framework, NP-KG, as a step toward computational discovery of pharmacokinetic NPDIs. NP-KG is a heterogeneous KG with biomedical ontologies, linked data, and full texts of the scientific literature, constructed with the Phenotype Knowledge Translator framework and the semantic relation extraction systems, SemRep and Integrated Network and Dynamic Reasoning Assembler. NP-KG was evaluated with case studies of pharmacokinetic green tea- and kratom-drug interactions through path searches and meta-path discovery to determine congruent and contradictory information compared to ground truth data. The fully integrated NP-KG consisted of 745,512 nodes and 7,249,576 edges. Evaluation of NP-KG resulted in congruent (38.98% for green tea, 50% for kratom), contradictory (15.25% for green tea, 21.43% for kratom), and both congruent and contradictory (15.25% for green tea, 21.43% for kratom) information. Potential pharmacokinetic mechanisms for several purported NPDIs, including the green tea-raloxifene, green tea-nadolol, kratom-midazolam, kratom-quetiapine, and kratom-venlafaxine interactions were congruent with the published literature. NP-KG is the first KG to integrate biomedical ontologies with full texts of the scientific literature focused on natural products. We demonstrate the application of NP-KG to identify pharmacokinetic interactions involving enzymes, transporters, and pharmaceutical drugs. We envision that NP-KG will facilitate improved human-machine collaboration to guide researchers in future studies of pharmacokinetic NPDIs. The NP-KG framework is publicly available at https://doi.org/10.5281/zenodo.6814507 and https://github.com/sanyabt/np-kg.

Via

Access Paper or Ask Questions

Discovering novel drug-supplement interactions using a dietary supplements knowledge graph generated from the biomedical literature

Jun 24, 2021

Dalton Schutte, Jake Vasilakes, Anu Bompelli, Yuqi Zhou, Marcelo Fiszman, Hua Xu, Halil Kilicoglu, Jeffrey R. Bishop, Terrence Adam, Rui Zhang

Figure 1 for Discovering novel drug-supplement interactions using a dietary supplements knowledge graph generated from the biomedical literature

Figure 2 for Discovering novel drug-supplement interactions using a dietary supplements knowledge graph generated from the biomedical literature

Figure 3 for Discovering novel drug-supplement interactions using a dietary supplements knowledge graph generated from the biomedical literature

Figure 4 for Discovering novel drug-supplement interactions using a dietary supplements knowledge graph generated from the biomedical literature

Abstract:OBJECTIVE: Leverage existing biomedical NLP tools and DS domain terminology to produce a novel and comprehensive knowledge graph containing dietary supplement (DS) information for discovering interactions between DS and drugs, or Drug-Supplement Interactions (DSI). MATERIALS AND METHODS: We created SemRepDS (an extension of SemRep), capable of extracting semantic relations from abstracts by leveraging a DS-specific terminology (iDISK) containing 28,884 DS terms not found in the UMLS. PubMed abstracts were processed using SemRepDS to generate semantic relations, which were then filtered using a PubMedBERT-based model to remove incorrect relations before generating our knowledge graph (SuppKG). Two pathways are used to identify potential DS-Drug interactions which are then evaluated by medical professionals for mechanistic plausibility. RESULTS: Comparison analysis found that SemRepDS returned 206.9% more DS relations and 158.5% more DS entities than SemRep. The fine-tuned BERT model obtained an F1 score of 0.8605 and removed 43.86% of the relations, improving the precision of the relations by 26.4% compared to pre-filtering. SuppKG consists of 2,928 DS-specific nodes. Manual review of findings identified 44 (88%) proposed DS-Gene-Drug and 32 (64%) proposed DS-Gene1-Function-Gene2-Drug pathways to be mechanistically plausible. DISCUSSION: The additional relations extracted using SemRepDS generated SuppKG that was used to find plausible DSI not found in the current literature. By the nature of the SuppKG, these interactions are unlikely to have been found using SemRep without the expanded DS terminology. CONCLUSION: We successfully extend SemRep to include DS information and produce SuppKG which can be used to find potential DS-Drug interactions.

* 14 pages, 4 tables, 1 figure

Via

Access Paper or Ask Questions

UIUC_BioNLP at SemEval-2021 Task 11: A Cascade of Neural Models for Structuring Scholarly NLP Contributions

May 12, 2021

Haoyang Liu, M. Janina Sarol, Halil Kilicoglu

Figure 1 for UIUC_BioNLP at SemEval-2021 Task 11: A Cascade of Neural Models for Structuring Scholarly NLP Contributions

Figure 2 for UIUC_BioNLP at SemEval-2021 Task 11: A Cascade of Neural Models for Structuring Scholarly NLP Contributions

Figure 3 for UIUC_BioNLP at SemEval-2021 Task 11: A Cascade of Neural Models for Structuring Scholarly NLP Contributions

Figure 4 for UIUC_BioNLP at SemEval-2021 Task 11: A Cascade of Neural Models for Structuring Scholarly NLP Contributions

Abstract:We propose a cascade of neural models that performs sentence classification, phrase recognition, and triple extraction to automatically structure the scholarly contributions of NLP publications. To identify the most important contribution sentences in a paper, we used a BERT-based classifier with positional features (Subtask 1). A BERT-CRF model was used to recognize and characterize relevant phrases in contribution sentences (Subtask 2). We categorized the triples into several types based on whether and how their elements were expressed in text, and addressed each type using separate BERT-based classifiers as well as rules (Subtask 3). Our system was officially ranked second in Phase 1 evaluation and first in both parts of Phase 2 evaluation. After fixing a submission error in Pharse 1, our approach yields the best results overall. In this paper, in addition to a system description, we also provide further analysis of our results, highlighting its strengths and limitations. We make our code publicly available at https://github.com/Liu-Hy/nlp-contrib-graph.

Via

Access Paper or Ask Questions