Examination of summarized medical records for ICD code classification via BERT

Dilek Aydogan Kilic, Deniz Kenan Kılıç, Izabela Ewa Nielsen

Research output: Contribution to journal › Journal article › Research › peer-review

Abstract

The International Classification of Diseases (ICD) is used by member countries of the World Health Organization (WHO). It standardizes diagnosis codes worldwide, which enables data comparison and analysis across nations, and it supports payment systems, healthcare research, service planning, and quality and safety management. However, the complex structure of the ICD system can lead to longer examination times, higher training costs, a greater need for human resources, payment problems caused by inaccurate coding, and unreliable data in health research. In addition, machine learning models for automated ICD coding struggle with lengthy medical notes. To address this challenge, the present study uses Medical Information Mart for Intensive Care (MIMIC-III) medical notes that have been summarized using the term frequency-inverse document frequency (TF-IDF) method. The summarized notes are then analyzed with deep learning, specifically bidirectional encoder representations from transformers (BERT), to classify disease diagnoses by ICD code. Although the proposed methodology using summarized data yields lower accuracy than state-of-the-art methods, the results are promising for further work on extracting summarized inputs and more important features, because the approach provides real-time ICD code classification and more explainable inputs.
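
The summarization step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how TF-IDF is applied, so everything below is an assumption made for illustration: each sentence is treated as a document, sentences are ranked by their mean TF-IDF weight, and the top-ranked ones are kept in their original order. The function names `split_sentences`, `tokenize`, and `tfidf_summarize` and the `top_k` parameter are hypothetical and do not come from the paper.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Lowercase the sentence and keep alphanumeric runs as tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def tfidf_summarize(text, top_k=2):
    """Keep the top_k sentences by mean TF-IDF weight, in original order.

    Assumption for this sketch: each sentence is one 'document', so terms
    that appear in many sentences get a low inverse document frequency.
    """
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)
    if n <= top_k:
        return sentences

    # Document frequency: number of sentences containing each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        tf = Counter(tokens)
        total = sum(
            (count / len(tokens)) * math.log(n / df[term])
            for term, count in tf.items()
        )
        # Average over distinct terms so long sentences are not favoured.
        scores.append(total / len(tf))

    # Pick the highest-scoring sentences, then restore document order.
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)[:top_k]
    return [sentences[i] for i in sorted(ranked)]


note = (
    "The patient was seen. "
    "The patient was seen. "
    "Acute renal failure noted with sepsis."
)
summary = tfidf_summarize(note, top_k=1)
print(summary)
```

In a pipeline like the one the abstract describes, the shortened text would then be tokenized and passed to a BERT classifier. That stage is omitted here because the abstract gives no model or training details.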
Original language: English
Journal: Applied Computer Science
Volume: 20
Issue number: 2
Pages (from-to): 60-74
Number of pages: 15
ISSN: 2353-6977
Publication status: Published - 30 Jun 2024

Keywords

  • Artificial intelligence
  • Natural language processing (NLP)
  • Classification problem
  • International classification of diseases (ICD)
  • Bidirectional Encoder Representations from Transformers (BERT)
  • MIMIC-III
