Title: Probing Biomedical Embeddings from Language Models

Apr 03, 2019

Qiao Jin, Bhuwan Dhingra, William W. Cohen, Xinghua Lu

Abstract: Contextualized word embeddings derived from pre-trained language models (LMs) show significant improvements on downstream NLP tasks. Pre-training on domain-specific corpora, such as biomedical articles, further improves their performance. In this paper, we conduct probing experiments to determine what additional information is carried intrinsically by the in-domain trained contextualized embeddings. For this we use the pre-trained LMs as fixed feature extractors and restrict the downstream task models to not have additional sequence modeling layers. We compare BERT, ELMo, BioBERT and BioELMo, a biomedical version of ELMo trained on 10M PubMed abstracts. Surprisingly, while fine-tuned BioBERT is better than BioELMo in biomedical NER and NLI tasks, as a fixed feature extractor BioELMo outperforms BioBERT in our probing tasks. We use visualization and nearest neighbor analysis to show that better encoding of entity-type and relational information leads to this superiority.
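
The probing setup described in the abstract (embeddings held fixed, no extra sequence modeling layers in the task model) can be illustrated with a minimal sketch. This is not the authors' code: the embeddings here are synthetic stand-ins for frozen LM outputs, and the function names `train_linear_probe` and `predict`, the softmax classifier, and all hyperparameters are assumptions chosen only for illustration.

```python
import numpy as np


def train_linear_probe(features, labels, num_classes, lr=0.01, epochs=200):
    """Fit a softmax classifier on fixed features; only W and b are learned."""
    n, d = features.shape
    W = np.zeros((d, num_classes))
    b = np.zeros(num_classes)
    onehot = np.eye(num_classes)[labels]
    for _ in range(epochs):
        logits = features @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n  # gradient of mean cross-entropy w.r.t. logits
        W -= lr * features.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b


def predict(features, W, b):
    """Classify each vector independently (no sequence modeling)."""
    return (features @ W + b).argmax(axis=1)


# Hypothetical data: synthetic clustered vectors standing in for
# per-token embeddings extracted once from a frozen language model.
rng = np.random.default_rng(0)
num_classes, dim, per_class = 3, 16, 50
centers = rng.normal(size=(num_classes, dim)) * 3.0
labels = np.repeat(np.arange(num_classes), per_class)
features = centers[labels] + rng.normal(size=(labels.size, dim))

W, b = train_linear_probe(features, labels, num_classes)
accuracy = (predict(features, W, b) == labels).mean()
print(f"probe accuracy on synthetic data: {accuracy:.2f}")
```

Because the probe has no capacity to model context on its own, its accuracy reflects what the fixed vectors already encode. In a real experiment, `features` would be replaced by embeddings extracted from each model under comparison, and accuracy would be measured on held-out data.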

* NAACL-HLT 2019 Workshop on Evaluating Vector Space Representations for NLP (RepEval)
