

Prior studies that manually assessed diagnosis codes found them to be erroneous or incomplete 4–30% of the time. Existing methods that validate codes and suggest missing ones rely on medical notes, so they are of limited use when notes are absent or not written in English. In this work, we propose using patients’ medication data to suggest and validate diagnosis codes. Previous attempts to assign codes using medication data have focused on a single condition. We present a proof-of-concept study that uses MIMIC-III prescription data to train a machine-learning model to predict a large collection of diagnosis codes, assigned at four levels of aggregation of the ICD-9 hierarchy. The model correctly recalls 58.2% of the ICD-9 categories, with a precision of 78.3%. We evaluate the model’s performance on more detailed ICD-9 levels and examine which codes and code groups can be accurately assigned using medication data. We also propose a specialized loss function designed to exploit ICD-9’s hierarchical structure; it performs consistently better than the non-hierarchical state of the art.
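To make the category-level figures above concrete, the following minimal sketch shows one way to roll ICD-9 diagnosis codes up to their categories and to score predicted code sets with precision and recall. It is an illustration under stated assumptions, not the authors' pipeline: the helper names `icd9_category` and `micro_precision_recall` are hypothetical, codes are assumed to be stored without a decimal point (as in MIMIC-III), micro-averaging over admissions is an assumption about how the scores are pooled, and the toy admissions are invented for the example.

```python
from typing import Iterable, Set, Tuple


def icd9_category(code: str) -> str:
    """Map an ICD-9 diagnosis code to its category (hypothetical helper).

    Assumes dot-free codes such as "4280" or "E8788".
    """
    code = code.strip().upper().replace(".", "")
    # E-codes have a four-character category (E + 3 digits);
    # numeric codes and V-codes use the first three characters.
    return code[:4] if code.startswith("E") else code[:3]


def micro_precision_recall(
    true_sets: Iterable[Set[str]], pred_sets: Iterable[Set[str]]
) -> Tuple[float, float]:
    """Micro-averaged precision and recall over per-admission label sets."""
    tp = fp = fn = 0
    for truth, pred in zip(true_sets, pred_sets):
        tp += len(truth & pred)   # codes both assigned and predicted
        fp += len(pred - truth)   # predicted but not assigned
        fn += len(truth - pred)   # assigned but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Toy data (invented): two admissions with assigned and predicted codes.
true_codes = [["4280", "25000", "E8788"], ["41401", "4019"]]
pred_codes = [["4289", "2724"], ["41401", "4019", "5849"]]

# Aggregate full codes to category level before scoring.
true_cats = [{icd9_category(c) for c in adm} for adm in true_codes]
pred_cats = [{icd9_category(c) for c in adm} for adm in pred_codes]

precision, recall = micro_precision_recall(true_cats, pred_cats)
print(f"precision={precision:.3f} recall={recall:.3f}")
```

Note that the prediction "4289" counts as correct at category level because it shares the category 428 with the assigned code "4280", which is why scores at coarser levels of the hierarchy are generally higher than at the full-code level.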
Original language: English
Title of host publication: Artificial Intelligence in Medicine: 18th International Conference on Artificial Intelligence in Medicine, AIME 2020, Minneapolis, MN, USA, August 25-28, 2020, Proceedings
Number of pages: 11
Publication date: 26 Sept 2020
ISBN (Print): 978-3-030-59136-6
ISBN (Electronic): 978-3-030-59137-3
Publication status: Published - 26 Sept 2020
Event: International Conference on Artificial Intelligence in Medicine
Duration: 25 Aug 2020 to 28 Aug 2020


Conference: International Conference on Artificial Intelligence in Medicine
Series: Lecture Notes in Computer Science


Title: Towards Assigning Diagnosis Codes Using Medication History
