Scopus Harvesting Series

Contextual Word Embedding for Biomedical Knowledge Extraction: a Rapid Review and Case Study

Dinithi Vithanage, University of Wollongong
Ping Yu, University of Wollongong
Lei Wang, University of Wollongong
Chao Deng, University of Wollongong

Publication Name

Journal of Healthcare Informatics Research

Abstract

Recent advances in natural language processing (NLP), particularly contextual word embedding models, have improved knowledge extraction from biomedical and healthcare texts. However, few comprehensive studies compare these models. This study conducts a scoping review and compares the performance of the major contextual word embedding models for biomedical knowledge extraction. The review of 26 articles retrieved from Scopus, PubMed, PubMed Central, and Google Scholar between 2017 and 2021 identified 18 notable contextual word embedding models: ELMo, BERT, BioBERT, BlueBERT, CancerBERT, DDS-BERT, RuBERT, LABSE, EhrBERT, MedBERT, Clinical BERT, Clinical BioBERT, Discharge Summary BERT, Discharge Summary BioBERT, GPT, GPT-2, GPT-3, and GPT2-Bio-Pt. A case study compared the performance of six representative models (ELMo, BERT, BioBERT, BlueBERT, Clinical BioBERT, and GPT-3) across three tasks: text classification, named entity recognition, and question answering. The evaluation used datasets comprising biomedical text from tweets, NCBI, and PubMed, and clinical notes sourced from two electronic health record datasets. Performance metrics included accuracy and F1 score. The results of the case study show that BioBERT performs best on biomedical text, while Clinical BioBERT performs best on clinical notes. These findings offer insight into contextual word embedding models for researchers, practitioners, and stakeholders who use NLP in biomedical and clinical document analysis.

Open Access Status

This publication is not available as open access

Funding Sponsor

University of Wollongong

Link to publisher version (DOI)

http://dx.doi.org/10.1007/s41666-023-00157-y
