TY - JOUR
T1 - Implementing the FAIR Data Principles in precision oncology: review of supporting initiatives
AU - Vesteghem, Charles
AU - Brøndum, Rasmus Froberg
AU - Sønderkær, Mads
AU - Sommer, Mia
AU - Schmitz, Alexander
AU - Bødker, Julie Støve
AU - Dybkær, Karen
AU - El-Galaly, Tarec Christoffer
AU - Bøgsted, Martin
PY - 2020/5
Y1 - 2020/5
N2 - Compelling research has recently shown that cancer is so heterogeneous that single research centres cannot produce enough data to fit prognostic and predictive models of sufficient accuracy. Data sharing in precision oncology is therefore of utmost importance. The Findable, Accessible, Interoperable and Reusable (FAIR) Data Principles have been developed to define good practices in data sharing. Motivated by the ambition of applying the FAIR Data Principles to our own clinical precision oncology implementations and research, we have performed a systematic literature review of potentially relevant initiatives. For clinical data, we suggest using the Genomic Data Commons model as a reference as it provides a field-tested and well-documented solution. Regarding classification of diagnosis, morphology and topography and drugs, we chose to follow the World Health Organization standards, i.e. ICD10, ICD-O-3 and Anatomical Therapeutic Chemical classifications, respectively. For the bioinformatics pipeline, the Genome Analysis ToolKit Best Practices using Docker containers offer a coherent solution and have therefore been selected. Regarding the naming of variants, we follow the Human Genome Variation Society's standard. For the IT infrastructure, we have built a centralized solution to participate in data sharing through federated solutions such as the Beacon Networks.
KW - FAIR Data Principles
KW - data sharing
KW - genomics
KW - precision oncology
KW - standards
UR - http://www.scopus.com/inward/record.url?scp=85079073377&partnerID=8YFLogxK
U2 - 10.1093/bib/bbz044
DO - 10.1093/bib/bbz044
M3 - Review article
C2 - 31263868
SN - 1467-5463
VL - 21
SP - 936
EP - 945
JO - Briefings in Bioinformatics
JF - Briefings in Bioinformatics
IS - 3
ER -