Qualitative analysis of manual annotations of clinical text with SNOMED CT

Jose Antonio Miñarro-Giménez; Catalina Martínez-Costa; Daniel Karlsson; Stefan Schulz; Kirstine Rosenbeck Gøeg

doi:10.1371/journal.pone.0209547

Publication: Contribution to journal › Journal article › Research › peer-reviewed

11 citations (Scopus)

Abstract

SNOMED CT provides about 300,000 codes with fine-grained concept definitions to support interoperability of health data. Coding clinical texts with medical terminologies is not a trivial task and is prone to disagreements between coders. We conducted a qualitative analysis to identify sources of disagreements in an annotation experiment which used a subset of SNOMED CT with some restrictions. A corpus of 20 English clinical text fragments from diverse origins and languages was annotated independently by two medically trained annotators following a specific annotation guideline. By following this guideline, the annotators had to assign sets of SNOMED CT codes to noun phrases, together with concept and term coverage ratings. Then, the annotations were manually examined against a reference standard to determine sources of disagreements. Five categories were identified. In our results, the most frequent cause of inter-annotator disagreement was related to human issues. In several cases disagreements revealed gaps in the annotation guidelines and lack of training of annotators. The remaining issues can be influenced by some SNOMED CT features.

Original language	English
Article number	e0209547
Journal	PLOS ONE
Volume	13
Issue number	12
Number of pages	15
ISSN	1932-6203
DOI	https://doi.org/10.1371/journal.pone.0209547
Status	Published - 1 Dec 2018

Access to document

10.1371/journal.pone.0209547 (License: CC BY 4.0)

Other files and links

http://www.scopus.com/inward/record.url?scp=85059228917&partnerID=8YFLogxK

Citation formats

@article{ede5a3062f7e43678aeb3a12962e81bc,

title = "Qualitative analysis of manual annotations of clinical text with SNOMED CT",

abstract = "SNOMED CT provides about 300,000 codes with fine-grained concept definitions to support interoperability of health data. Coding clinical texts with medical terminologies is not a trivial task and is prone to disagreements between coders. We conducted a qualitative analysis to identify sources of disagreements in an annotation experiment which used a subset of SNOMED CT with some restrictions. A corpus of 20 English clinical text fragments from diverse origins and languages was annotated independently by two medically trained annotators following a specific annotation guideline. By following this guideline, the annotators had to assign sets of SNOMED CT codes to noun phrases, together with concept and term coverage ratings. Then, the annotations were manually examined against a reference standard to determine sources of disagreements. Five categories were identified. In our results, the most frequent cause of inter-annotator disagreement was related to human issues. In several cases disagreements revealed gaps in the annotation guidelines and lack of training of annotators. The remaining issues can be influenced by some SNOMED CT features.",

author = "Mi{\~n}arro-Gim{\'e}nez, {Jose Antonio} and Catalina Mart{\'i}nez-Costa and Daniel Karlsson and Stefan Schulz and G{\o}eg, {Kirstine Rosenbeck}",

year = "2018",

month = dec,

day = "1",

doi = "10.1371/journal.pone.0209547",

language = "English",

volume = "13",

journal = "PLOS ONE",

issn = "1932-6203",

publisher = "Public Library of Science",

number = "12",

}

TY - JOUR

T1 - Qualitative analysis of manual annotations of clinical text with SNOMED CT

AU - Miñarro-Giménez, Jose Antonio

AU - Martínez-Costa, Catalina

AU - Karlsson, Daniel

AU - Schulz, Stefan

AU - Gøeg, Kirstine Rosenbeck

PY - 2018/12/1

Y1 - 2018/12/1

N2 - SNOMED CT provides about 300,000 codes with fine-grained concept definitions to support interoperability of health data. Coding clinical texts with medical terminologies is not a trivial task and is prone to disagreements between coders. We conducted a qualitative analysis to identify sources of disagreements in an annotation experiment which used a subset of SNOMED CT with some restrictions. A corpus of 20 English clinical text fragments from diverse origins and languages was annotated independently by two medically trained annotators following a specific annotation guideline. By following this guideline, the annotators had to assign sets of SNOMED CT codes to noun phrases, together with concept and term coverage ratings. Then, the annotations were manually examined against a reference standard to determine sources of disagreements. Five categories were identified. In our results, the most frequent cause of inter-annotator disagreement was related to human issues. In several cases disagreements revealed gaps in the annotation guidelines and lack of training of annotators. The remaining issues can be influenced by some SNOMED CT features.

AB - SNOMED CT provides about 300,000 codes with fine-grained concept definitions to support interoperability of health data. Coding clinical texts with medical terminologies is not a trivial task and is prone to disagreements between coders. We conducted a qualitative analysis to identify sources of disagreements in an annotation experiment which used a subset of SNOMED CT with some restrictions. A corpus of 20 English clinical text fragments from diverse origins and languages was annotated independently by two medically trained annotators following a specific annotation guideline. By following this guideline, the annotators had to assign sets of SNOMED CT codes to noun phrases, together with concept and term coverage ratings. Then, the annotations were manually examined against a reference standard to determine sources of disagreements. Five categories were identified. In our results, the most frequent cause of inter-annotator disagreement was related to human issues. In several cases disagreements revealed gaps in the annotation guidelines and lack of training of annotators. The remaining issues can be influenced by some SNOMED CT features.

UR - http://www.scopus.com/inward/record.url?scp=85059228917&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0209547

DO - 10.1371/journal.pone.0209547

M3 - Journal article

C2 - 30589855

SN - 1932-6203

VL - 13

JO - PLOS ONE

JF - PLOS ONE

IS - 12

M1 - e0209547

ER -